<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychol.</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2025.1666974</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Cognitive load scale for AI-assisted L2 writing: scale development and validation</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author"><name><surname>Yao</surname> <given-names>Guangyuan</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author" corresp="yes"><name><surname>Fan</surname> <given-names>Lingxi</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref><xref ref-type="aff" rid="aff3"><sup>3</sup></xref><xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/3029518/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of English, University of Macau</institution>, <addr-line>Taipa</addr-line>, <country>Macao SAR, China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Hangzhou Innovation Institute, Beihang University</institution>, <addr-line>Hangzhou</addr-line>, <country>China</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Language Science and Technology, The Hong Kong Polytechnic University</institution>, <addr-line>Hong Kong, Hong Kong SAR</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by" id="fn0001">
<p>Edited by: <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/2086236/overview">Mutasim Al-Deaibes</ext-link>, American University of Sharjah, United Arab Emirates</p>
</fn>
<fn fn-type="edited-by" id="fn0002">
<p>Reviewed by: <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/2159352/overview">Hesham Abdel Karim Aldamen</ext-link>, The University of Jordan, Jordan</p>
<p><ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/3191569/overview">Sebah Al-Ali</ext-link>, American University of Sharjah, United Arab Emirates</p>
</fn>
<corresp id="c001">&#x002A;Correspondence: Lingxi Fan, <email>lingxi.fan@connect.polyu.hk</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>30</day>
<month>10</month>
<year>2025</year>
</pub-date>
<pub-date pub-type="collection">
<year>2025</year>
</pub-date>
<volume>16</volume>
<elocation-id>1666974</elocation-id>
<history>
<date date-type="received">
<day>16</day>
<month>07</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>03</day>
<month>10</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2025 Yao and Fan.</copyright-statement>
<copyright-year>2025</copyright-year>
<copyright-holder>Yao and Fan</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>This study developed and validated the Cognitive Load Scale for AI-assisted L2 Writing (CL-AI-L2W), an instrument designed to measure the unique cognitive demands of human-AI collaborative writing. As generative AI becomes integral to second language (L2) composition, understanding its impact on cognitive processes is critical. Using a mixed-methods approach grounded in cognitive writing theory and human-AI interaction research, an initial item pool was refined through expert feedback and interviews. An Exploratory Factor Analysis (<italic>N</italic> =&#x202F;241) on a 35-item draft scale revealed a four-factor structure. A subsequent Confirmatory Factor Analysis (<italic>N</italic> =&#x202F;305) confirmed this structure with excellent model fit. The final 18-item scale measures four distinct dimensions of cognitive load: (1) Prompt Management, (2) Critical Evaluation, (3) Integrative Synthesis, and (4) Authorial Core Processing. The scale demonstrated excellent internal consistency and strong criterion-related validity through significant correlations with writing anxiety, self-efficacy, and perceived mental effort. As the first validated instrument of its kind, the CL-AI-L2W offers a crucial tool for advancing writing theory and informing pedagogy in AI-enhanced learning environments.</p>
</abstract>
<kwd-group>
<kwd>AI-assisted writing</kwd>
<kwd>cognitive load</kwd>
<kwd>second language writing</kwd>
<kwd>scale development</kwd>
<kwd>generative AI</kwd>
<kwd>human-AI interaction</kwd>
</kwd-group>
<counts>
<fig-count count="1"/>
<table-count count="6"/>
<equation-count count="0"/>
<ref-count count="61"/>
<page-count count="12"/>
<word-count count="9617"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Psychology of Language</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="sec1">
<label>1</label>
<title>Introduction</title>
<p>Writing in a second language (L2) is an unequivocally complex and cognitively demanding endeavor (<xref ref-type="bibr" rid="ref18">Granena, 2023</xref>; <xref ref-type="bibr" rid="ref27">Lee, 2005</xref>; <xref ref-type="bibr" rid="ref57">Zabihi, 2018</xref>). It requires the simultaneous orchestration of multiple processes, from high-level planning and idea generation to low-level linguistic encoding and transcription (<xref ref-type="bibr" rid="ref19">Hayes, 1996</xref>; <xref ref-type="bibr" rid="ref22">Kellogg, 1996</xref>). The inherent difficulty of this task is compounded by a range of cognitive and affective individual differences that mediate performance. Research has consistently shown that cognitive factors, particularly working memory (WM) capacity, play a significant role in a learner&#x2019;s ability to manage the intricate demands of L2 composition (<xref ref-type="bibr" rid="ref3">Baoshu and Chuanbi, 2015</xref>; <xref ref-type="bibr" rid="ref4">Baoshu and Luo, 2012</xref>; <xref ref-type="bibr" rid="ref7">Bergsleithner, 2010</xref>; <xref ref-type="bibr" rid="ref18">Granena, 2023</xref>; <xref ref-type="bibr" rid="ref26">Kormos, 2023</xref>; <xref ref-type="bibr" rid="ref34">Manch&#x00F3;n et al., 2023</xref>; <xref ref-type="bibr" rid="ref57">Zabihi, 2018</xref>). Concurrently, affective factors such as writing self-efficacy (<xref ref-type="bibr" rid="ref43">Pajares and Valiante, 2006</xref>), writing anxiety (<xref ref-type="bibr" rid="ref8">Cheng, 2004</xref>), and enjoyment (<xref ref-type="bibr" rid="ref30">Li et al., 2024</xref>) are powerful predictors of writing processes and outcomes. 
As <xref ref-type="bibr" rid="ref16">Flower and Hayes (1980)</xref> famously metaphorized, the writer is a busy switchboard operator juggling numerous constraints, a challenge that is amplified in an L2 context where linguistic processes are less automatized (<xref ref-type="bibr" rid="ref56">Weigle, 2005</xref>; <xref ref-type="bibr" rid="ref57">Zabihi, 2018</xref>).</p>
<p>The landscape of L2 writing is currently undergoing a paradigm shift with the advent and widespread adoption of generative artificial intelligence (AI) tools like ChatGPT and Bing Chat (<xref ref-type="bibr" rid="ref31">Liu et al., 2024</xref>). These technologies are not mere aids for proofreading; they are active partners in the composing process, capable of generating ideas, structuring outlines, drafting text, and creating multimodal content (<xref ref-type="bibr" rid="ref5">Barrot, 2023</xref>; <xref ref-type="bibr" rid="ref31">Liu et al., 2024</xref>; <xref ref-type="bibr" rid="ref49">Su et al., 2023</xref>). This integration fundamentally alters the cognitive ecosystem of writing. The cognitive load, traditionally associated with internal processes of planning, translating, and reviewing (<xref ref-type="bibr" rid="ref17">Flower and Hayes, 1981</xref>; <xref ref-type="bibr" rid="ref22">Kellogg, 1996</xref>), is now shared, supplemented, and reshaped by the cognitive demands of human-AI interaction. As <xref ref-type="bibr" rid="ref31">Liu et al. (2024)</xref> demonstrate, new cognitive processes&#x2014;such as prompt engineering, critical output evaluation, and the synthesis of AI-generated text&#x2014;have become central to the writing experience.</p>
<p>While existing research has developed instruments to measure the cognitive load of traditional argumentative writing (e.g., <xref ref-type="bibr" rid="ref29">Li and Wang, 2024</xref>), these scales do not account for the unique cognitive demands imposed by interacting with generative AI. There is a pressing need for a validated measurement tool that can capture this new, hybrid cognitive experience. Understanding the distribution of cognitive load across both traditional writing sub-processes and novel AI-interaction processes is crucial for researchers seeking to model this new form of writing and for educators aiming to develop effective pedagogies for AI-assisted writing (<xref ref-type="bibr" rid="ref13">Deng et al., 2023</xref>).</p>
<p>The integration of AI tools fundamentally alters the cognitive load profile of L2 writing. According to Cognitive Load Theory (CLT) (<xref ref-type="bibr" rid="ref53">Sweller et al., 1998</xref>), effective learning occurs when cognitive resources are optimally managed. While AI has the potential to reduce the intrinsic cognitive load associated with linguistic production, it may also introduce new forms of extraneous and germane cognitive load related to human-AI interaction. Understanding this new cognitive architecture is essential for developing effective pedagogy. However, a validated instrument to measure these distinct facets of cognitive load in AI-assisted writing is currently lacking. Therefore, the present study aims to address this critical gap by developing and validating the Cognitive Load Scale for AI-assisted L2 Writing (CL-AI-L2W). Following the rigorous methodological precedents for scale development in the field (e.g., <xref ref-type="bibr" rid="ref8">Cheng, 2004</xref>; <xref ref-type="bibr" rid="ref29">Li and Wang, 2024</xref>), this study employs a mixed-methods approach to generate items, establish a robust factor structure, and ensure the scale is a reliable and valid instrument for future research and pedagogical application.</p>
</sec>
<sec id="sec2">
<label>2</label>
<title>Literature review</title>
<sec id="sec3">
<label>2.1</label>
<title>Cognitive demands and processes in writing</title>
<p>Writing in a second language (L2) is a profoundly complex cognitive task that imposes significant demands on a learner&#x2019;s limited mental resources (<xref ref-type="bibr" rid="ref18">Granena, 2023</xref>; <xref ref-type="bibr" rid="ref26">Kormos, 2023</xref>). Foundational cognitive models conceptualize writing as a non-linear, problem-solving activity comprising recursive processes of planning (generating ideas), translating (converting ideas into text), and reviewing (evaluating and revising) (<xref ref-type="bibr" rid="ref17">Flower and Hayes, 1981</xref>; <xref ref-type="bibr" rid="ref19">Hayes, 1996</xref>). The successful orchestration of these processes relies heavily on working memory (WM), a limited-capacity system responsible for the temporary storage and manipulation of information (<xref ref-type="bibr" rid="ref1">Baddeley and Hitch, 1974</xref>). As <xref ref-type="bibr" rid="ref22">Kellogg&#x2019;s (1996)</xref> model specifies, the central executive component of WM must coordinate attentional resources to manage content, structure, and audience considerations simultaneously, making writing one of the most demanding tasks for WM (<xref ref-type="bibr" rid="ref35">McCutchen, 2000</xref>; <xref ref-type="bibr" rid="ref38">Olive, 2004</xref>).</p>
<p>This cognitive burden is intensified in an L2 context. L2 learners&#x2019; linguistic processes, such as lexical retrieval and grammatical encoding, are often less automatized and more effortful (<xref ref-type="bibr" rid="ref6">Bereiter, 1980</xref>; <xref ref-type="bibr" rid="ref47">Scardamalia, 1981</xref>; <xref ref-type="bibr" rid="ref59">Zimmermann, 2000</xref>). Consequently, a substantial portion of their limited WM capacity is consumed by lower-level concerns, leaving fewer resources for higher-level processes like argumentation and organization (<xref ref-type="bibr" rid="ref23">Kellogg, 2001</xref>; <xref ref-type="bibr" rid="ref56">Weigle, 2005</xref>; <xref ref-type="bibr" rid="ref57">Zabihi, 2018</xref>). This phenomenon is effectively explained by Cognitive Load Theory (CLT), which provides a framework for understanding how WM limitations affect learning and performance (<xref ref-type="bibr" rid="ref50">Sweller, 1988</xref>; <xref ref-type="bibr" rid="ref53">Sweller et al., 1998</xref>).</p>
<p>CLT distinguishes between three types of cognitive load. Intrinsic cognitive load is the inherent difficulty determined by the complexity of the writing task itself&#x2014;the number of interacting elements a writer must process simultaneously, such as developing a thesis, organizing paragraphs, and selecting appropriate vocabulary (<xref ref-type="bibr" rid="ref51">Sweller, 2010</xref>; <xref ref-type="bibr" rid="ref54">Sweller et al., 2019</xref>). Extraneous cognitive load is generated by suboptimal instructional design or task conditions that consume mental resources without contributing to learning, such as unclear prompts or distracting interfaces (<xref ref-type="bibr" rid="ref40">Paas et al., 2003</xref>). Finally, germane cognitive load refers to the effortful mental work involved in processing information and constructing long-term schemas, which is essential for developing writing skills (<xref ref-type="bibr" rid="ref52">Sweller, 2011</xref>). In L2 writing, the high intrinsic load of the task can easily lead to cognitive overload, a state where the total cognitive demand exceeds the capacity of WM (<xref ref-type="bibr" rid="ref21">Jiang and Kalyuga, 2022</xref>).</p>
<p>While WM capacity and cognitive load are central, a constellation of individual differences mediates their effects on writing performance (<xref ref-type="bibr" rid="ref25">Kormos, 2012</xref>, <xref ref-type="bibr" rid="ref26">2023</xref>). Affective factors, in particular, play a crucial role (<xref ref-type="bibr" rid="ref36">McLeod, 1987</xref>). Writing anxiety, a skill-specific apprehension, has been consistently linked to poorer performance (<xref ref-type="bibr" rid="ref8">Cheng, 2004</xref>; <xref ref-type="bibr" rid="ref9">Cheng et al., 1999</xref>; <xref ref-type="bibr" rid="ref10">Choi, 2014</xref>; <xref ref-type="bibr" rid="ref12">Daly and Miller, 1975</xref>; <xref ref-type="bibr" rid="ref15">Faigley et al., 1981</xref>; <xref ref-type="bibr" rid="ref57">Zabihi, 2018</xref>), as it may consume WM resources with intrusive thoughts and worries. Conversely, writing self-efficacy&#x2014;one&#x2019;s belief in their ability to write successfully&#x2014;is a robust positive predictor of effort, persistence, and outcomes (<xref ref-type="bibr" rid="ref2">Bandura, 1997</xref>; <xref ref-type="bibr" rid="ref24">Klassen, 2003</xref>; <xref ref-type="bibr" rid="ref41">Pajares, 2003</xref>; <xref ref-type="bibr" rid="ref43">Pajares and Valiante, 2006</xref>; <xref ref-type="bibr" rid="ref44">Prat-Sala and Redford, 2012</xref>; <xref ref-type="bibr" rid="ref48">Schunk, 2003</xref>). These factors are intertwined, with self-efficacy often mediating the negative effects of anxiety (<xref ref-type="bibr" rid="ref42">Pajares and Johnson, 1994</xref>; <xref ref-type="bibr" rid="ref57">Zabihi, 2018</xref>). Understanding this interplay between cognitive and affective factors is essential for creating a complete picture of the L2 writing experience.</p>
</sec>
<sec id="sec4">
<label>2.2</label>
<title>Measuring cognitive load in writing</title>
<p>Given its theoretical importance, accurately measuring the cognitive load experienced during writing is a key methodological challenge. Early approaches often relied on unidimensional, subjective self-report scales, such as <xref ref-type="bibr" rid="ref39">Paas&#x2019;s (1992)</xref> single-item, 9-point scale measuring perceived mental effort. While useful for gauging overall task difficulty (<xref ref-type="bibr" rid="ref45">R&#x00E9;v&#x00E9;sz et al., 2016</xref>; <xref ref-type="bibr" rid="ref46">Robinson, 2001</xref>), such measures are insufficient for diagnosing the specific sources of cognitive strain. They cannot distinguish, for example, whether a writer&#x2019;s high cognitive load stems from difficulties with planning, linguistic expression, or revision (<xref ref-type="bibr" rid="ref37">N&#x00FC;ckles et al., 2020</xref>).</p>
<p>Recognizing this limitation, recent research has moved toward developing multidimensional instruments. The work of <xref ref-type="bibr" rid="ref29">Li and Wang (2024)</xref> in developing the EFL Argumentative Writing Cognitive Load Scale (EFL-AWCLS) represents a significant advancement and a methodological blueprint for the present study. By grounding their item generation in both cognitive writing theory (<xref ref-type="bibr" rid="ref20">Hayes, 2012</xref>; <xref ref-type="bibr" rid="ref22">Kellogg, 1996</xref>) and qualitative data from learners, they developed a reliable and valid scale that captures distinct dimensions of cognitive load, such as argumentation, organization, and language expression. Their work demonstrates the necessity and feasibility of creating a nuanced, multidimensional tool to pinpoint where learners allocate their cognitive resources during traditional writing tasks. Other research has similarly applied a cognitive load perspective to understand the effects of instructional support, such as scaffolding with graphic organizers (<xref ref-type="bibr" rid="ref28">Lee and Tan, 2010</xref>) or collaborative writing tasks (<xref ref-type="bibr" rid="ref21">Jiang and Kalyuga, 2022</xref>), further validating CLT as a powerful framework for writing research.</p>
</sec>
<sec id="sec5">
<label>2.3</label>
<title>The reconfiguration of cognitive load in AI-assisted writing</title>
<p>The models and measurement tools discussed above were developed for a pre-AI writing environment. The recent integration of generative AI tools like ChatGPT fundamentally reconfigures the cognitive processes and the distribution of cognitive load in writing (<xref ref-type="bibr" rid="ref31">Liu et al., 2024</xref>). These tools are not passive aids; they are active partners that can generate ideas, draft text, and structure arguments, creating a new, hybrid cognitive ecosystem where cognitive responsibilities are distributed between the human writer and the AI system (<xref ref-type="bibr" rid="ref58">Zhao, 2022</xref>).</p>
<p>This partnership introduces entirely new, cognitively demanding activities into the writing process. Based on their insightful qualitative investigation, <xref ref-type="bibr" rid="ref31">Liu et al. (2024)</xref> identified several novel sources of cognitive load:</p>
<p>Prompt management: The writer&#x2019;s task shifts from solely generating ideas to crafting, refining, and iterating effective prompts to guide the AI. This iterative, metacognitive process represents a significant investment of mental effort.</p>
<p>Critical evaluation: The writer must serve as a critical gatekeeper of AI-generated content. This involves a heavy cognitive load related to fact-checking for AI &#x201C;hallucinations&#x201D; (<xref ref-type="bibr" rid="ref33">Lund et al., 2023</xref>), identifying potential bias (<xref ref-type="bibr" rid="ref32">Lucy and Bamman, 2021</xref>), and assessing the relevance and stylistic appropriateness of the output.</p>
<p>Integrative synthesis: The writer must blend AI-generated text with their own. This requires substantial effort to paraphrase for academic integrity (<xref ref-type="bibr" rid="ref11">Cotton et al., 2023</xref>), maintain a coherent authorial voice, and logically connect disparate pieces of information.</p>
<p>These new processes interact with traditional ones in a complex reallocation of cognitive resources. From a CLT perspective (<xref ref-type="bibr" rid="ref52">Sweller, 2011</xref>; <xref ref-type="bibr" rid="ref54">Sweller et al., 2019</xref>), AI can potentially reduce intrinsic load by handling complex sentence construction, but it can also introduce significant extraneous load if its output is inaccurate or irrelevant, forcing the writer to expend effort on evaluation and correction. The effort spent on learning how to prompt the AI effectively and synthesize its output can be seen as a form of germane load&#x2014;a productive investment in building new human-AI collaboration skills.</p>
</sec>
<sec id="sec6">
<label>2.4</label>
<title>The present study</title>
<p>The preceding review highlights a significant and pressing gap: the absence of a psychometrically validated instrument designed to measure the multifaceted cognitive load inherent in the AI-assisted L2 writing process. While qualitative work has provided rich initial insights into the novel cognitive activities involved (<xref ref-type="bibr" rid="ref31">Liu et al., 2024</xref>), the field lacks a quantitative tool to systematically examine the distribution, antecedents, and consequences of this reconfigured cognitive load. The development of a dedicated, multidimensional scale is a critical next step to advance both cognitive writing theory and evidence-based pedagogy in the age of AI.</p>
<p>Accordingly, this study undertakes the development and validation of the Cognitive Load Scale for AI-assisted L2 Writing (CL-AI-L2W). To guide this process, the study is structured around the following research questions:</p>
<list list-type="simple">
<list-item>
<p>RQ1: What are the underlying dimensions (or factors) of cognitive load experienced by second language (L2) learners during an AI-assisted writing task?</p>
</list-item>
<list-item>
<p>RQ2: To what extent is the newly developed Cognitive Load Scale for AI-assisted L2 Writing (CL-AI-L2W) a reliable and valid instrument for measuring this construct?</p>
</list-item>
</list>
</sec>
</sec>
<sec sec-type="methods" id="sec7">
<label>3</label>
<title>Method</title>
<p>This study employed a sequential mixed-methods research design to develop and validate the Cognitive Load Scale for AI-assisted L2 Writing (CL-AI-L2W). The research was conducted in three major phases, following established best practices in scale development (<xref ref-type="bibr" rid="ref14">DeVellis, 2017</xref>; <xref ref-type="bibr" rid="ref29">Li and Wang, 2024</xref>): (1) item generation and content validity assessment, (2) a pilot study for item refinement and Exploratory Factor Analysis (EFA), and (3) a main study for Confirmatory Factor Analysis (CFA) and further validity and reliability testing.</p>
<sec id="sec8">
<label>3.1</label>
<title>Item generation and content validity</title>
<p>The initial item pool was generated through a two-pronged approach to ensure both theoretical grounding and ecological validity.</p>
<p>First, the item development process was guided by a clear theoretical framework to ensure comprehensive coverage of the construct. Drawing from the literature review, we identified five <italic>a priori</italic> theoretical domains expected to constitute the cognitive load in AI-assisted L2 writing. Two domains represented traditional writing processes, informed by the models of <xref ref-type="bibr" rid="ref17">Flower and Hayes (1981)</xref>, <xref ref-type="bibr" rid="ref22">Kellogg (1996)</xref>, and the dimensions of the EFL-AWCLS (<xref ref-type="bibr" rid="ref29">Li and Wang, 2024</xref>). Three domains represented novel AI-interaction processes, derived from the qualitative findings of <xref ref-type="bibr" rid="ref31">Liu et al. (2024)</xref>.</p>
<p>Based on this five-domain framework (<xref ref-type="table" rid="tab1">Table 1</xref>), we generated an initial pool of 48 items. Items for the traditional domains were adapted from existing literature and the EFL-AWCLS, while items for the AI-interaction domains were developed based on the specific cognitive activities described by <xref ref-type="bibr" rid="ref31">Liu et al. (2024)</xref>. To ensure the items were grounded in learners&#x2019; authentic experiences, we conducted semi-structured interviews with a small, purposive sample of 12 L2 learners (intermediate to advanced proficiency) who had experience using generative AI for writing. Participants were asked to complete a short AI-assisted writing task and then describe the mental effort they invested in different parts of the process. The language and concepts they used were incorporated into the item wording.</p>
<table-wrap position="float" id="tab1">
<label>Table 1</label>
<caption>
<p>Five-domain framework for the initial pool of items.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Domain number</th>
<th align="left" valign="top">Domain name</th>
<th align="left" valign="top">Core focus</th>
<th align="left" valign="top">Category</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">1</td>
<td align="left" valign="top">Planning and organization</td>
<td align="left" valign="top">Foundational stages of writing: outlining, structuring, and organizing ideas.</td>
<td align="left" valign="top">Traditional writing processes</td>
</tr>
<tr>
<td align="left" valign="top">2</td>
<td align="left" valign="top">Language expression and revision</td>
<td align="left" valign="top">Crafting and refining text for clarity, style, and grammatical correctness.</td>
<td align="left" valign="top">Traditional writing processes</td>
</tr>
<tr>
<td align="left" valign="top">3</td>
<td align="left" valign="top">Prompt engineering and management</td>
<td align="left" valign="top">Formulating, iterating, and managing instructions for AI tools.</td>
<td align="left" valign="top">AI interaction</td>
</tr>
<tr>
<td align="left" valign="top">4</td>
<td align="left" valign="top">Critical output evaluation</td>
<td align="left" valign="top">Critically assessing AI-generated content for accuracy, bias, and relevance.</td>
<td align="left" valign="top">AI interaction</td>
</tr>
<tr>
<td align="left" valign="top">5</td>
<td align="left" valign="top">Integration and synthesis</td>
<td align="left" valign="top">Combining AI-generated content with original writing to create a cohesive final product.</td>
<td align="left" valign="top">AI interaction and writing</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>This process resulted in an initial pool of 48 items, each framed as a question about the mental effort required for a specific activity (e.g., &#x201C;How much mental effort did it take you to design effective prompts for the AI?&#x201D;). All items were rated on a 7-point Likert scale, ranging from 1 (very, very low mental effort) to 7 (very, very high mental effort), consistent with established cognitive load measurement (<xref ref-type="bibr" rid="ref39">Paas, 1992</xref>).</p>
<p>The initial 48-item pool was submitted to a panel of six experts for content validity assessment. The panel comprised three associate professors specializing in L2 writing and psycholinguistics, and three doctoral candidates whose research focuses on AI in language education. The experts were asked to evaluate each item based on its relevance (is it relevant to the construct of AI-assisted writing cognitive load?) and clarity (is the wording unambiguous?). They provided both quantitative ratings and qualitative feedback. Items that received low ratings for relevance or were flagged as unclear by more than two experts were revised or eliminated. This process reduced the item pool to a refined set of 35 items for the pilot study.</p>
</sec>
<sec id="sec9">
<label>3.2</label>
<title>Pilot study and exploratory factor analysis (EFA)</title>
<p>A total of 258 L2 learners were recruited from a university in China to participate in the pilot study. After removing incomplete responses, the final sample for EFA consisted of 241 participants (155 female, 86 male). Their ages ranged from 18 to 23 (M&#x202F;=&#x202F;20.1, SD&#x202F;=&#x202F;1.5). All were non-English majors who had passed the CET-4 or CET-6, indicating an intermediate to high-intermediate English proficiency level. All participants reported having used generative AI tools (e.g., ChatGPT, Bing Chat, DeepSeek) for academic tasks prior to the study. Data were collected online. Participants were first given a standardized set of instructions and a 10-min tutorial on using DeepSeek for an argumentative writing task. They were then presented with an argumentative writing prompt (&#x201C;Should universities invest more in AI-based educational tools?&#x201D;) and given 40&#x202F;min to write a 250-word essay using DeepSeek as an assistant. Interactions with the AI tool were conducted on the public DeepSeek V3.1 interface (accessed at <ext-link xlink:href="https://chat.deepseek.com/" ext-link-type="uri">https://chat.deepseek.com/</ext-link>). Participants were explicitly instructed not to enter any personal or identifiable information into the AI chat prompt. To further protect privacy, the research team did not collect or store the AI chat logs. Data collection was limited to the final written essays produced by the participants and their questionnaire responses, so no direct interactions with the AI system were stored.</p>
<p>Immediately after completing the task, participants were directed to a questionnaire containing the 35-item draft scale. Prior to the main analyses, the data from both samples were screened for accuracy, missing values, outliers, and assumptions of normality. Data entry accuracy was verified through a random check of 10% of the cases. The rate of missing data was minimal (&#x003C;1%) and handled using pairwise deletion, which is appropriate for a low volume of missingness. Multivariate outliers were assessed using Mahalanobis distance (<italic>p</italic> &#x003C;&#x202F;0.001); no cases were identified as significant outliers requiring removal. Finally, the assumption of univariate normality was checked by examining skewness and kurtosis values for all 18 final scale items. All values fell within the acceptable range of &#x2212;2 to +2, indicating that the data did not significantly deviate from a normal distribution.</p>
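<p>The univariate normality check described above can be illustrated with a minimal sketch. It assumes simple moment-based estimators of skewness and excess kurtosis, which differ slightly from the bias-corrected values reported by statistical packages; the function names and the sample responses are hypothetical and are not taken from the study.</p>
<preformat>
```python
from statistics import fmean


def central_moment(values, order):
    """Return the central moment of the given order (population form)."""
    mean = fmean(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness; assumes the values are not all identical."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def within_normality_range(values, bound=2.0):
    """Apply the -2 to +2 screening rule to both statistics."""
    worst = max(abs(skewness(values)), abs(excess_kurtosis(values)))
    return bound >= worst


# Hypothetical 7-point responses to a single item (illustration only).
item_responses = [1, 2, 3, 4, 5, 6, 7]
print(skewness(item_responses), excess_kurtosis(item_responses))
print(within_normality_range(item_responses))
```
</preformat>
<p>In practice the check would be run once per item, and any item falling outside the range would be inspected before factor analysis.</p>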
<p>To address RQ1, an Exploratory Factor Analysis (EFA) was conducted using SPSS (Version 28). First, item analysis was performed, and items with corrected item-total correlations below 0.40 were removed. The suitability of the data for factor analysis was confirmed using the Kaiser-Meyer-Olkin (KMO) measure and Bartlett&#x2019;s Test of Sphericity.</p>
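<p>The item-analysis step can be illustrated with a minimal sketch. It assumes that the corrected item-total correlation is the Pearson correlation between an item and the sum of all remaining items; the response matrix and the helper names are hypothetical and are not taken from the study data or from SPSS.</p>
<preformat preformat-type="code"><![CDATA[
```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def corrected_item_total(responses, item_index):
    """Correlate one item with the sum of all *other* items."""
    item = [row[item_index] for row in responses]
    rest = [sum(row) - row[item_index] for row in responses]
    return pearson(item, rest)


# Hypothetical 7-point responses (rows = respondents, columns = items).
# These numbers are invented for illustration and are NOT study data.
toy_responses = [
    [6, 6, 5, 2],
    [5, 6, 6, 7],
    [3, 2, 3, 4],
    [2, 3, 2, 6],
    [7, 6, 7, 1],
    [4, 4, 5, 5],
]

CUTOFF = 0.40  # removal threshold stated in the text

# True = item meets the cutoff and would be kept; False = flagged for removal.
flags = {
    i + 1: corrected_item_total(toy_responses, i) >= CUTOFF
    for i in range(len(toy_responses[0]))
}
```
]]></preformat>
<p>In this toy matrix the fourth item runs against the other three, so it falls below the 0.40 cutoff and would be flagged, whereas the first item would be kept.</p>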
<p>Principal Axis Factoring (PAF) with Oblimin rotation was chosen because the underlying factors of cognitive load were theoretically expected to be correlated. The number of factors to retain was determined by multiple criteria: (a) parallel analysis, (b) the Kaiser criterion (eigenvalues &#x003E; 1), (c) examination of the scree plot, and (d) the theoretical interpretability of the resulting factors. The parallel analysis, which is generally regarded as one of the most accurate retention methods, suggested a four-factor solution. Items were retained if their primary factor loading was above 0.40 and they did not exhibit substantial cross-loadings (i.e., a loading &#x003E; 0.32 on a secondary factor).</p>
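<p>The loading-based retention rule can be sketched as follows. The sketch assumes the 0.40 primary-loading minimum stated above and the 0.32 cross-loading cutoff given in the Table 3 footnote; the function name and return labels are hypothetical, and the example loadings are copied from Table 3.</p>
<preformat preformat-type="code"><![CDATA[
```python
PRIMARY_MIN = 0.40  # minimum primary loading stated in the text
CROSS_MAX = 0.32    # cross-loading cutoff from the Table 3 footnote


def screen_item(loadings, primary_min=PRIMARY_MIN, cross_max=CROSS_MAX):
    """Classify an item from its absolute loadings on all factors."""
    ranked = sorted((abs(value) for value in loadings), reverse=True)
    primary, secondary = ranked[0], ranked[1]
    if primary < primary_min:
        return "remove: weak primary loading"
    if secondary > cross_max:
        return "remove: cross-loading"
    return "retain"


# Loadings (F1, F2, F3, F4) copied from Table 3.
table3_examples = {
    10: (0.87, 0.11, 0.09, 0.03),
    20: (0.15, 0.51, 0.28, 0.43),
    7: (0.15, 0.18, 0.30, 0.38),
    25: (0.10, 0.22, 0.53, 0.28),
}

decisions = {item: screen_item(vals) for item, vals in table3_examples.items()}
```
]]></preformat>
<p>Item 25 passes the purely numerical rule, yet Table 3 lists it as removed for redundancy with Item 22. The sketch therefore covers only the loading criteria, not the conceptual judgments applied alongside them.</p>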
</sec>
<sec id="sec10">
<label>3.3</label>
<title>Main study and confirmatory factor analysis (CFA)</title>
<p>A second, independent sample of 312 L2 learners was recruited from a different university to avoid sample overlap. After data screening, the final sample for the main study comprised 305 participants (198 female, 107 male), with a similar demographic profile to Sample 1 (Age: M&#x202F;=&#x202F;20.5, SD&#x202F;=&#x202F;1.7; intermediate-to-high intermediate English proficiency).</p>
<p>In addition to the refined CL-AI-L2W scale derived from the EFA, the following established instruments were administered to assess criterion-related validity:</p>
<p>Subjective Mental Effort Scale: A single-item, 9-point scale adapted from <xref ref-type="bibr" rid="ref39">Paas (1992)</xref> to measure overall perceived mental effort, used to assess convergent validity.</p>
<p>Second Language Writing Anxiety Inventory (SLWAI): The 22-item scale developed by <xref ref-type="bibr" rid="ref8">Cheng (2004)</xref> to measure writing apprehension.</p>
<p>Writing Self-Efficacy Scale: A 10-item subscale adapted from <xref ref-type="bibr" rid="ref43">Pajares and Valiante (2006)</xref> measuring learners&#x2019; confidence in their ability to perform writing tasks.</p>
<p>The procedure for the main study was identical to that of the pilot study. After completing the AI-assisted argumentative writing task, participants were directed to a questionnaire booklet. The booklet always began with the final 18-item CL-AI-L2W scale, followed by the three validation scales (Second Language Writing Anxiety Inventory, Writing Self-Efficacy Scale, and the single-item mental effort scale). To minimize potential order effects among the validation scales, their presentation order was fully counterbalanced across participants: all six possible orderings of the three scales (3!&#x202F;=&#x202F;6) were created, and participants were randomly assigned to one of the six versions of the questionnaire booklet.</p>
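<p>A minimal sketch of this counterbalancing scheme is given below. It enumerates the six orderings of the three validation scales and assigns each participant to one of them. Simple random assignment with a fixed seed is an assumption made for illustration (the text does not state whether assignment was blocked to equalize cell sizes), and the function and variable names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random
from itertools import permutations

VALIDATION_SCALES = ("SLWAI", "Writing Self-Efficacy", "Mental Effort")

# All 3! = 6 presentation orders of the three validation scales.
ORDERINGS = list(permutations(VALIDATION_SCALES))


def assign_booklets(participant_ids, seed=0):
    """Randomly assign each participant to one of the six booklet versions.

    The CL-AI-L2W always comes first; only the validation scales are reordered.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    return {
        pid: ("CL-AI-L2W",) + rng.choice(ORDERINGS)
        for pid in participant_ids
    }


# Hypothetical participant IDs 1..305 (the size of Sample 2).
booklets = assign_booklets(range(1, 306))
```
]]></preformat>
<p>Because the assignment here is purely random, the six versions will not be used equally often; a blocked scheme would be needed to guarantee balanced cells.</p>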
<p>To address RQ2, a comprehensive set of analyses was conducted to establish the reliability and validity of the CL-AI-L2W. All analyses, unless otherwise specified, were performed using SPSS (Version 28) and Mplus (Version 8.8). First, a Confirmatory Factor Analysis (CFA) was performed on the data from Sample 2 to test the four-factor structure identified in the EFA. Given the 7-point ordinal nature of the Likert-scale items, we employed the Weighted Least Squares Mean and Variance Adjusted (WLSMV) estimator, which is robust for such data. Model fit was evaluated against established criteria: &#x03C7;<sup>2</sup>/df&#x202F;&#x003C;&#x202F;3, CFI&#x202F;&#x003E;&#x202F;0.95, TLI&#x202F;&#x003E;&#x202F;0.95, RMSEA &#x003C; 0.06, and WRMR &#x003C; 1.0 (<xref ref-type="bibr" rid="ref9001">Hu and Bentler, 1999</xref>; <xref ref-type="bibr" rid="ref9003">Yu, 2002</xref>). Second, we assessed construct validity in detail. Convergent validity was examined by calculating the Average Variance Extracted (AVE), and discriminant validity was tested using the Fornell-Larcker criterion and the Heterotrait-Monotrait Ratio of Correlations (HTMT). Third, internal consistency reliability was assessed using both Cronbach&#x2019;s alpha and McDonald&#x2019;s omega (<italic>&#x03C9;</italic>) coefficients, with values above 0.70 considered acceptable. Measurement invariance of the scale was also tested across gender and the two study samples (EFA vs. CFA) to ensure its psychometric equivalence across groups. Finally, criterion-related validity was examined through Pearson correlation analyses, investigating the relationships between the CL-AI-L2W scores and the scores from the other validated scales (overall mental effort, writing anxiety, and writing self-efficacy). 
It was hypothesized that the CL-AI-L2W would show a strong positive correlation with mental effort, a moderate positive correlation with writing anxiety, and a moderate negative correlation with writing self-efficacy. The <italic>a priori</italic> item blueprint and the distribution of items across the development stages are summarized in <xref ref-type="table" rid="tab2">Table 2</xref>.</p>
<table-wrap position="float" id="tab2">
<label>Table 2</label>
<caption>
<p><italic>A priori</italic> item blueprint and distribution across stages.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top" char="&#x00D7;">Domain</th>
<th align="char" valign="top" char="&#x00D7;">Definition (summary)</th>
<th align="char" valign="top" char="&#x00D7;">Initial pool (48)</th>
<th align="char" valign="top" char="&#x00D7;">Post&#x2013;expert review (35)</th>
<th align="char" valign="top" char="&#x00D7;">Final validated scale (18)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Planning and organization (PO)</td>
<td align="left" valign="top">Deciding argument, outlining, structuring</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">7</td>
<td align="left" valign="top">2 (merged into Authorial Core Processing)</td>
</tr>
<tr>
<td align="left" valign="top">Prompt management (PM)</td>
<td align="left" valign="top">Designing and refining prompts for AI</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">6</td>
<td align="left" valign="top">5</td>
</tr>
<tr>
<td align="left" valign="top">Critical evaluation (CE)</td>
<td align="left" valign="top">Evaluating AI outputs for accuracy, relevance, bias, style</td>
<td align="center" valign="top">9</td>
<td align="center" valign="top">7</td>
<td align="left" valign="top">5</td>
</tr>
<tr>
<td align="left" valign="top">Integration and synthesis (IS)</td>
<td align="left" valign="top">Paraphrasing and blending AI with own text</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">6</td>
<td align="left" valign="top">4</td>
</tr>
<tr>
<td align="left" valign="top">Language expression and revision (LER)</td>
<td align="left" valign="top">Vocabulary, grammar, coherence, revision</td>
<td align="center" valign="top">13</td>
<td align="center" valign="top">9</td>
<td align="left" valign="top">2 (merged into Authorial Core Processing)</td>
</tr>
<tr>
<td align="left" valign="top">Total</td>
<td align="left" valign="top">&#x2013;</td>
<td align="center" valign="top">48</td>
<td align="center" valign="top">35</td>
<td align="left" valign="top">18</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec sec-type="results" id="sec11">
<label>4</label>
<title>Results</title>
<sec id="sec12">
<label>4.1</label>
<title>Exploratory factor analysis</title>
<p>To answer RQ1, an EFA was performed on the data from the pilot study (Sample 1, <italic>N</italic>&#x202F;=&#x202F;241) to identify the underlying dimensions of cognitive load in AI-assisted L2 writing.</p>
<p>First, item analysis was conducted on the initial 35 items. Five items were removed due to low corrected item-total correlations (&#x003C; 0.40). These items, along with their correlation values, were: Item 34 (&#x201C;Concentrate on the writing task without getting distracted&#x201D;): r&#x202F;=&#x202F;0.31; Item 33 (&#x201C;Manage your time effectively&#x201D;): r&#x202F;=&#x202F;0.34; Item 29 (&#x201C;Focus on spelling and punctuation&#x201D;): r&#x202F;=&#x202F;0.35; Item 5 (&#x201C;Ensure a smooth and logical flow between paragraphs&#x201D;): r&#x202F;=&#x202F;0.38; Item 13 (&#x201C;Manage the conversation with the AI&#x201D;): r&#x202F;=&#x202F;0.39.</p>
<p>The remaining 30 items were subjected to Principal Axis Factoring (PAF). The suitability of the data for factor analysis was confirmed, with a high Kaiser-Meyer-Olkin (KMO) value of 0.92 and a significant Bartlett&#x2019;s Test of Sphericity (&#x03C7;<sup>2</sup>(435)&#x202F;=&#x202F;3854.21, <italic>p</italic>&#x202F;&#x003C;&#x202F;0.001). The PAF with Oblimin rotation initially yielded five factors with eigenvalues greater than 1. However, the scree plot showed a clear elbow after the fourth factor, and the fifth factor was weak and difficult to interpret. Therefore, a four-factor solution was specified, which was theoretically more coherent and parsimonious. During this process, a further 12 items were removed because they had primary factor loadings below 0.40, exhibited substantial cross-loadings (&#x003E; 0.32) on a second factor, or were conceptually redundant with a stronger-loading item (see <xref ref-type="table" rid="tab3">Table 3</xref>).</p>
<p>The final EFA resulted in a clean and interpretable four-factor structure comprising 18 items, which collectively explained 71.84% of the total variance. The factor loadings for each subscale are presented in <xref ref-type="table" rid="tab3">Table 3</xref>. All items loaded strongly on their respective factors (ranging from 0.69 to 0.87), and all subscales demonstrated excellent internal consistency (<italic>&#x03B1;</italic>&#x202F;&#x2265;&#x202F;0.85).</p>
<table-wrap position="float" id="tab3">
<label>Table 3</label>
<caption>
<p>EFA results and item status.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Item no.</th>
<th align="left" valign="top">Item content summary</th>
<th align="left" valign="top">Factor loadings (F1, F2, F3, F4)</th>
<th align="left" valign="top">Status and rationale for decision</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">10</td>
<td align="left" valign="top">Rephrase/refine prompts when AI&#x2019;s answer was not helpful.</td>
<td align="left" valign="top">0.87, 0.11, 0.09, 0.03</td>
<td align="left" valign="top">Retained: Strong, clean loading on F1 (PM)</td>
</tr>
<tr>
<td align="left" valign="top">8</td>
<td align="left" valign="top">Figure out the best way to phrase initial questions.</td>
<td align="left" valign="top">0.85, 0.15, 0.12, 0.08</td>
<td align="left" valign="top">Retained: Strong, clean loading on F1 (PM)</td>
</tr>
<tr>
<td align="left" valign="top">12</td>
<td align="left" valign="top">Ask effective follow-up questions.</td>
<td align="left" valign="top">0.81, 0.18, 0.10, 0.05</td>
<td align="left" valign="top">Retained: Strong, clean loading on F1 (PM)</td>
</tr>
<tr>
<td align="left" valign="top">9</td>
<td align="left" valign="top">Think of specific keywords to guide the AI.</td>
<td align="left" valign="top">0.79, 0.13, 0.08, 0.11</td>
<td align="left" valign="top">Retained: Strong, clean loading on F1 (PM)</td>
</tr>
<tr>
<td align="left" valign="top">11</td>
<td align="left" valign="top">Break down a complex task into smaller prompts.</td>
<td align="left" valign="top">0.75, 0.09, 0.16, 0.14</td>
<td align="left" valign="top">Retained: Strong, clean loading on F1 (PM)</td>
</tr>
<tr>
<td align="left" valign="top">15</td>
<td align="left" valign="top">Evaluate if AI suggestions were relevant to your argument.</td>
<td align="left" valign="top">0.12, 0.86, 0.14, 0.07</td>
<td align="left" valign="top">Retained: Strong, clean loading on F2 (CE)</td>
</tr>
<tr>
<td align="left" valign="top">17</td>
<td align="left" valign="top">Decide which parts of AI output to use and which to ignore.</td>
<td align="left" valign="top">0.10, 0.83, 0.21, 0.10</td>
<td align="left" valign="top">Retained: Strong, clean loading on F2 (CE)</td>
</tr>
<tr>
<td align="left" valign="top">14</td>
<td align="left" valign="top">Judge if AI information was factually accurate.</td>
<td align="left" valign="top">0.08, 0.81, 0.11, 0.05</td>
<td align="left" valign="top">Retained: Strong, clean loading on F2 (CE)</td>
</tr>
<tr>
<td align="left" valign="top">16</td>
<td align="left" valign="top">Assess the tone and style of the AI text.</td>
<td align="left" valign="top">0.14, 0.77, 0.19, 0.13</td>
<td align="left" valign="top">Retained: Strong, clean loading on F2 (CE)</td>
</tr>
<tr>
<td align="left" valign="top">18</td>
<td align="left" valign="top">Check the AI text for potential bias.</td>
<td align="left" valign="top">0.06, 0.72, 0.09, 0.08</td>
<td align="left" valign="top">Retained: Strong, clean loading on F2 (CE)</td>
</tr>
<tr>
<td align="left" valign="top">22</td>
<td align="left" valign="top">Blend AI text smoothly with your own writing.</td>
<td align="left" valign="top">0.11, 0.18, 0.85, 0.15</td>
<td align="left" valign="top">Retained: Strong, clean loading on F3 (IS)</td>
</tr>
<tr>
<td align="left" valign="top">21</td>
<td align="left" valign="top">Paraphrase or rewrite AI sentences in your own words.</td>
<td align="left" valign="top">0.09, 0.15, 0.82, 0.12</td>
<td align="left" valign="top">Retained: Strong, clean loading on F3 (IS)</td>
</tr>
<tr>
<td align="left" valign="top">24</td>
<td align="left" valign="top">Connect your own ideas logically with AI ideas.</td>
<td align="left" valign="top">0.13, 0.20, 0.80, 0.19</td>
<td align="left" valign="top">Retained: Strong, clean loading on F3 (IS)</td>
</tr>
<tr>
<td align="left" valign="top">23</td>
<td align="left" valign="top">Ensure your personal authorial voice was not lost.</td>
<td align="left" valign="top">0.07, 0.12, 0.74, 0.25</td>
<td align="left" valign="top">Retained: Strong, clean loading on F3 (IS)</td>
</tr>
<tr>
<td align="left" valign="top">3</td>
<td align="left" valign="top">Create a logical structure or outline for the essay.</td>
<td align="left" valign="top">0.09, 0.10, 0.17, 0.84</td>
<td align="left" valign="top">Retained: Strong, clean loading on F4 (ACP)</td>
</tr>
<tr>
<td align="left" valign="top">1</td>
<td align="left" valign="top">Decide on the main argument or position for your essay.</td>
<td align="left" valign="top">0.11, 0.08, 0.11, 0.80</td>
<td align="left" valign="top">Retained: Strong, clean loading on F4 (ACP)</td>
</tr>
<tr>
<td align="left" valign="top">28</td>
<td align="left" valign="top">Construct grammatically correct English sentences.</td>
<td align="left" valign="top">0.05, 0.06, 0.14, 0.73</td>
<td align="left" valign="top">Retained: Strong, clean loading on F4 (ACP)</td>
</tr>
<tr>
<td align="left" valign="top">27</td>
<td align="left" valign="top">Find the right vocabulary to express your ideas precisely.</td>
<td align="left" valign="top">0.08, 0.10, 0.12, 0.69</td>
<td align="left" valign="top">Retained: Strong, clean loading on F4 (ACP)</td>
</tr>
<tr>
<td align="left" valign="top">20</td>
<td align="left" valign="top">Identify awkward phrasing in AI text.</td>
<td align="left" valign="top">0.15, 0.51, 0.28, 0.43</td>
<td align="left" valign="top">Removed (EFA): Significant cross-loading on F2 (CE) and F4 (ACP)</td>
</tr>
<tr>
<td align="left" valign="top">30</td>
<td align="left" valign="top">Revise sentences you wrote yourself.</td>
<td align="left" valign="top">0.08, 0.13, 0.44, 0.49</td>
<td align="left" valign="top">Removed (EFA): Significant cross-loading on F3 (IS) and F4 (ACP)</td>
</tr>
<tr>
<td align="left" valign="top">31</td>
<td align="left" valign="top">Review entire essay for overall coherence.</td>
<td align="left" valign="top">0.18, 0.29, 0.39, 0.46</td>
<td align="left" valign="top">Removed (EFA): Significant cross-loading and primary loading is weak (&#x003C;0.50)</td>
</tr>
<tr>
<td align="left" valign="top">4</td>
<td align="left" valign="top">Organize arguments within each paragraph.</td>
<td align="left" valign="top">0.06, 0.11, 0.25, 0.42</td>
<td align="left" valign="top">Removed (EFA): Conceptually redundant with Item 3; weaker loading</td>
</tr>
<tr>
<td align="left" valign="top">25</td>
<td align="left" valign="top">Maintain consistent flow between your text and AI&#x2019;s.</td>
<td align="left" valign="top">0.10, 0.22, 0.53, 0.28</td>
<td align="left" valign="top">Removed (EFA): Redundant with Item 22; weaker loading</td>
</tr>
<tr>
<td align="left" valign="top">32</td>
<td align="left" valign="top">Ensure final text answered the prompt.</td>
<td align="left" valign="top">0.21, 0.41, 0.28, 0.25</td>
<td align="left" valign="top">Removed (EFA): Redundant with Item 15; weaker loading</td>
</tr>
<tr>
<td align="left" valign="top">35</td>
<td align="left" valign="top">Monitor overall logic of your argument.</td>
<td align="left" valign="top">0.19, 0.26, 0.23, 0.48</td>
<td align="left" valign="top">Removed (EFA): Redundant with Item 1; weaker loading</td>
</tr>
<tr>
<td align="left" valign="top">7</td>
<td align="left" valign="top">Think about the intro and conclusion.</td>
<td align="left" valign="top">0.15, 0.18, 0.30, 0.38</td>
<td align="left" valign="top">Removed (EFA): Primary loading &#x003C; 0.40</td>
</tr>
<tr>
<td align="left" valign="top">6</td>
<td align="left" valign="top">Decide what info you needed to find/generate.</td>
<td align="left" valign="top">0.28, 0.25, 0.15, 0.37</td>
<td align="left" valign="top">Removed (EFA): Primary loading &#x003C; 0.40</td>
</tr>
<tr>
<td align="left" valign="top">2</td>
<td align="left" valign="top">Come up with initial ideas on your own.</td>
<td align="left" valign="top">0.12, 0.09, 0.21, 0.35</td>
<td align="left" valign="top">Removed (EFA): Primary loading &#x003C; 0.40</td>
</tr>
<tr>
<td align="left" valign="top">19</td>
<td align="left" valign="top">Compare different responses from the AI.</td>
<td align="left" valign="top">0.25, 0.31, 0.38, 0.19</td>
<td align="left" valign="top">Removed (EFA): Primary loading &#x003C; 0.40</td>
</tr>
<tr>
<td align="left" valign="top">26</td>
<td align="left" valign="top">Synthesize info from multiple AI responses.</td>
<td align="left" valign="top">0.21, 0.28, 0.39, 0.22</td>
<td align="left" valign="top">Removed (EFA): Primary loading &#x003C; 0.40</td>
</tr>
<tr>
<td align="left" valign="top">13</td>
<td align="left" valign="top">Manage the conversation with the AI.</td>
<td align="left" valign="top">N/A</td>
<td align="left" valign="top">Removed (Item Analysis): Corrected item-total correlation &#x003C; 0.40</td>
</tr>
<tr>
<td align="left" valign="top">5</td>
<td align="left" valign="top">Ensure a smooth and logical flow between paragraphs.</td>
<td align="left" valign="top">N/A</td>
<td align="left" valign="top">Removed (Item Analysis): Corrected item-total correlation &#x003C; 0.40</td>
</tr>
<tr>
<td align="left" valign="top">29</td>
<td align="left" valign="top">Focus on spelling and punctuation.</td>
<td align="left" valign="top">N/A</td>
<td align="left" valign="top">Removed (Item Analysis): Corrected item-total correlation &#x003C; 0.40</td>
</tr>
<tr>
<td align="left" valign="top">33</td>
<td align="left" valign="top">Manage your time effectively.</td>
<td align="left" valign="top">N/A</td>
<td align="left" valign="top">Removed (Item Analysis): Corrected item-total correlation &#x003C; 0.40</td>
</tr>
<tr>
<td align="left" valign="top">34</td>
<td align="left" valign="top">Concentrate on the writing task without getting distracted.</td>
<td align="left" valign="top">N/A</td>
<td align="left" valign="top">Removed (Item Analysis): Corrected item-total correlation &#x003C; 0.40</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
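<p>Principal Axis Factoring with Oblimin rotation was performed on 30 items. Item numbers correspond to the draft scale in <xref ref-type="supplementary-material" rid="SM1">Appendix A</xref>. Removal criteria: (1) corrected item-total correlation &#x003C; 0.40; (2) EFA primary loading &#x003C; 0.40 or a cross-loading &#x003E; 0.32; (3) conceptual redundancy with a stronger-loading item. N/A, not applicable (item removed during item analysis and not entered into the EFA). PM, Prompt Management; CE, Critical Evaluation; IS, Integrative Synthesis; ACP, Authorial Core Processing.</p>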
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="sec13">
<label>4.2</label>
<title>Confirmatory factor analysis (CFA)</title>
<p>To further test the four-factor structure of the CL-AI-L2W identified in the EFA (RQ2), a CFA was conducted on the data from the main study (Sample 2, <italic>N</italic>&#x202F;=&#x202F;305). Following methodological recommendations for ordinal data, the model was estimated using the Weighted Least Squares Mean and Variance Adjusted (WLSMV) estimator. The hypothesized four-factor model fit the data well: &#x03C7;<sup>2</sup>(129)&#x202F;=&#x202F;265.82, <italic>p</italic> &#x003C;&#x202F;0.001; &#x03C7;<sup>2</sup>/df&#x202F;=&#x202F;2.06; CFI&#x202F;=&#x202F;0.97; TLI&#x202F;=&#x202F;0.96; RMSEA&#x202F;=&#x202F;0.059 (90% CI&#x202F;=&#x202F;[0.050, 0.068]); and WRMR&#x202F;=&#x202F;0.95. All point estimates met the criteria for good model fit (CFI/TLI&#x202F;&#x003E;&#x202F;0.95, RMSEA &#x003C; 0.06, WRMR &#x003C; 1.0), although the upper bound of the RMSEA confidence interval (0.068) slightly exceeded 0.06. Overall, these results provide strong empirical support for the four-factor structure of the scale.</p>
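<p>Two of the reported fit indices can be checked directly from the chi-square statistic. The sketch below uses the textbook single-group RMSEA formula, which is an assumption on our part: Mplus computes RMSEA from the WLSMV-adjusted statistic and may differ in detail. The helper names are hypothetical; the input values are those reported above.</p>
<preformat preformat-type="code"><![CDATA[
```python
from math import sqrt


def chi_square_ratio(chi2, df):
    """Normed chi-square (chi2 / df)."""
    return chi2 / df


def rmsea(chi2, df, n):
    """Point estimate of RMSEA from the textbook single-group formula."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Values reported for the four-factor CFA (Sample 2).
CHI2, DF, N = 265.82, 129, 305

ratio = chi_square_ratio(CHI2, DF)
rmsea_hat = rmsea(CHI2, DF, N)

# Compare against the cutoffs named in the text.
checks = {
    "chi2/df < 3": ratio < 3,
    "RMSEA < 0.06": rmsea_hat < 0.06,
}
```
]]></preformat>
<p>Under this formula the reported chi-square, degrees of freedom, and sample size reproduce the reported ratio and RMSEA point estimate to rounding.</p>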
<p>As shown in the final model (see <xref ref-type="fig" rid="fig1">Figure 1</xref>), all standardized factor loadings were statistically significant (<italic>p</italic> &#x003C;&#x202F;0.001) and substantial, ranging from 0.71 to 0.89. This indicates that all 18 items are strong and reliable indicators of their respective latent constructs. The correlations between the four latent factors were moderate to strong, ranging from r&#x202F;=&#x202F;0.52 (between Prompt Management and Authorial Core Processing) to r&#x202F;=&#x202F;0.73 (between Critical Evaluation and Integrative Synthesis). These correlations indicate that the factors are distinct yet related components of the overarching construct of cognitive load in AI-assisted writing, which is consistent with the use of an oblique rotation in the EFA.</p>
<fig position="float" id="fig1">
<label>Figure 1</label>
<caption>
<p>Standardized path diagram of the four-factor CFA model for the CL-AI-L2W.</p>
</caption>
<graphic xlink:href="fpsyg-16-1666974-g001.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Diagram depicting relationships among four main constructs: Prompt Management, Integrative Synthesis, Critical Evaluation, and Authorial Core Processing. Each construct is connected by arrows with correlation values, indicating strength of relationships. Prompt Management links to five elements (PM1-PM5); Integrative Synthesis to four (IS1-IS4); Critical Evaluation to five (CE1-CE5); and Authorial Core Processing to four (ACP1-ACP4). Correlation values between constructs vary from 0.52 to 0.73, showcasing interdependencies and influences among them.</alt-text>
</graphic>
</fig>
</sec>
<sec id="sec14">
<label>4.3</label>
<title>Reliability and criterion-related validity</title>
<p>The reliability for the overall 18-item CL-AI-L2W scale was excellent, with a Cronbach&#x2019;s alpha of 0.94. The subscale reliabilities, as reported in <xref ref-type="table" rid="tab5">Table 5</xref>, were also high (<italic>&#x03B1;</italic>: PM&#x202F;=&#x202F;0.91; CE&#x202F;=&#x202F;0.92; IS&#x202F;=&#x202F;0.89; ACP&#x202F;=&#x202F;0.87).</p>
<p>To establish criterion-related validity, Pearson correlations were calculated between the CL-AI-L2W (total and subscale scores) and the other measures administered in the main study. Descriptive statistics and the correlation matrix are presented in <xref ref-type="table" rid="tab4">Table 4</xref>.</p>
<table-wrap position="float" id="tab4">
<label>Table 4</label>
<caption>
<p>Descriptive statistics and Pearson correlations among variables.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Variable</th>
<th align="center" valign="top">M</th>
<th align="center" valign="top">SD</th>
<th align="center" valign="top">1</th>
<th align="center" valign="top">2</th>
<th align="center" valign="top">3</th>
<th align="center" valign="top">4</th>
<th align="center" valign="top">5</th>
<th align="center" valign="top">6</th>
<th align="center" valign="top">7</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle">1. CL-AI-L2W total</td>
<td align="center" valign="middle">4.31</td>
<td align="center" valign="middle">1.15</td>
<td align="center" valign="middle">&#x2013;</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td align="left" valign="middle">2. Prompt management</td>
<td align="center" valign="middle">4.55</td>
<td align="center" valign="middle">1.30</td>
<td align="center" valign="middle">0.81&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2013;</td>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td align="left" valign="middle">3. Critical evaluation</td>
<td align="center" valign="middle">4.81</td>
<td align="center" valign="middle">1.25</td>
<td align="center" valign="middle">0.85&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.65&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2013;</td>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td align="left" valign="middle">4. Integrative synthesis</td>
<td align="center" valign="middle">4.40</td>
<td align="center" valign="middle">1.28</td>
<td align="center" valign="middle">0.83&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.61&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.71&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2013;</td>
<td/>
<td/>
<td/>
</tr>
<tr>
<td align="left" valign="middle">5. Authorial core processing</td>
<td align="center" valign="middle">3.48</td>
<td align="center" valign="middle">1.21</td>
<td align="center" valign="middle">0.76&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.48&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.55&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.58&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2013;</td>
<td/>
<td/>
</tr>
<tr>
<td align="left" valign="middle">6. Overall mental effort</td>
<td align="center" valign="middle">6.52</td>
<td align="center" valign="middle">1.45</td>
<td align="center" valign="middle">0.72&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.60&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.68&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.64&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.51&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2013;</td>
<td/>
</tr>
<tr>
<td align="left" valign="middle">7. Writing anxiety</td>
<td align="center" valign="middle">3.15</td>
<td align="center" valign="middle">0.98</td>
<td align="center" valign="middle">0.45&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.38&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.49&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.41&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.35&#x002A;&#x002A;</td>
<td align="center" valign="middle">0.48&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2013;</td>
</tr>
<tr>
<td align="left" valign="middle">8. Writing self-efficacy</td>
<td align="center" valign="middle">3.88</td>
<td align="center" valign="middle">1.05</td>
<td align="center" valign="middle">&#x2212;0.51&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2212;0.42&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2212;0.55&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2212;0.47&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2212;0.40&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2212;0.53&#x002A;&#x002A;</td>
<td align="center" valign="middle">&#x2212;0.62&#x002A;&#x002A;</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>&#x002A;<italic>p</italic>&#x202F;&#x003C;&#x202F;0.05, &#x002A;&#x002A;<italic>p</italic>&#x202F;&#x003C;&#x202F;0.01.</p>
</table-wrap-foot>
</table-wrap>
<p>As hypothesized, the total CL-AI-L2W score showed a strong, positive correlation with the single-item overall mental effort scale (r&#x202F;=&#x202F;0.72, <italic>p</italic>&#x202F;&#x003C;&#x202F;0.01), supporting its convergent validity. Furthermore, the total score was moderately and positively correlated with writing anxiety (r&#x202F;=&#x202F;0.45, <italic>p</italic>&#x202F;&#x003C;&#x202F;0.01) and moderately and negatively correlated with writing self-efficacy (r&#x202F;=&#x202F;&#x2212;0.51, <italic>p</italic>&#x202F;&#x003C;&#x202F;0.01). These significant correlations provide strong evidence for the criterion-related validity of the new scale. The final 18-item CL-AI-L2W is presented in <xref ref-type="supplementary-material" rid="SM1">Appendix B</xref>.</p>
<p>A series of analyses was conducted to establish the reliability and validity of the CL-AI-L2W. <xref ref-type="table" rid="tab5">Table 5</xref> summarizes the descriptive statistics, reliability coefficients, and validity assessments. The internal consistency of the subscales was excellent: both Cronbach&#x2019;s alpha (<italic>&#x03B1;</italic>) and McDonald&#x2019;s omega (<italic>&#x03C9;</italic>) coefficients for all four factors ranged from 0.87 to 0.93, well above the 0.70 threshold.</p>
<p>Convergent validity was supported, with the Average Variance Extracted (AVE) for each factor exceeding the 0.50 criterion (ranging from 0.63 to 0.70). That is, each latent construct accounted, on average, for at least 63% of the variance in its items. The strong and significant factor loadings reported in the CFA (Section 4.2) provide additional evidence for convergent validity.</p>
<p>Discriminant validity was assessed using two methods. First, following the Fornell-Larcker criterion, the square root of the AVE for each construct was greater than its correlation with any other construct, providing initial support for discriminant validity. Second, we calculated the Heterotrait-Monotrait Ratio of Correlations (HTMT). All HTMT values were below the conservative threshold of 0.85, ranging from 0.59 (PM-ACP) to 0.81 (CE-IS), indicating that the four factors are empirically distinct constructs.</p>
<table-wrap position="float" id="tab5">
<label>Table 5</label>
<caption>
<p>Descriptive statistics, reliability, and validity assessment for the CL-AI-L2W subscales.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Variable</th>
<th align="center" valign="top">M</th>
<th align="center" valign="top">SD</th>
<th align="center" valign="top">&#x03C9;</th>
<th align="center" valign="top">&#x03B1;</th>
<th align="center" valign="top">AVE</th>
<th align="center" valign="top">1</th>
<th align="center" valign="top">2</th>
<th align="center" valign="top">3</th>
<th align="center" valign="top">4</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle">1. Prompt management</td>
<td align="center" valign="middle">4.55</td>
<td align="center" valign="middle">1.30</td>
<td align="center" valign="middle">0.92</td>
<td align="center" valign="middle">0.91</td>
<td align="center" valign="middle">0.68</td>
<td align="center" valign="middle">0.82</td>
<td/>
<td/>
<td/>
</tr>
<tr>
<td align="left" valign="middle">2. Critical evaluation</td>
<td align="center" valign="middle">4.81</td>
<td align="center" valign="middle">1.25</td>
<td align="center" valign="middle">0.93</td>
<td align="center" valign="middle">0.92</td>
<td align="center" valign="middle">0.70</td>
<td align="center" valign="middle">0.67</td>
<td align="center" valign="middle">0.84</td>
<td/>
<td/>
</tr>
<tr>
<td align="left" valign="middle">3. Integrative synthesis</td>
<td align="center" valign="middle">4.40</td>
<td align="center" valign="middle">1.28</td>
<td align="center" valign="middle">0.90</td>
<td align="center" valign="middle">0.89</td>
<td align="center" valign="middle">0.67</td>
<td align="center" valign="middle">0.63</td>
<td align="center" valign="middle">0.73</td>
<td align="center" valign="middle">0.82</td>
<td/>
</tr>
<tr>
<td align="left" valign="middle">4. Authorial core processing</td>
<td align="center" valign="middle">3.48</td>
<td align="center" valign="middle">1.21</td>
<td align="center" valign="middle">0.88</td>
<td align="center" valign="middle">0.87</td>
<td align="center" valign="middle">0.63</td>
<td align="center" valign="middle">0.52</td>
<td align="center" valign="middle">0.58</td>
<td align="center" valign="middle">0.61</td>
<td align="center" valign="middle">0.79</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>N</italic> =&#x202F;305. M, Mean; SD, Standard Deviation; &#x03C9;, McDonald&#x2019;s Omega; &#x03B1;, Cronbach&#x2019;s Alpha; AVE, Average Variance Extracted. The diagonal elements of the correlation matrix (columns 1&#x2013;4) are the square roots of the AVE. Off-diagonal elements are the latent factor correlations derived from the CFA model. For discriminant validity, each diagonal value should be greater than the off-diagonal values in its row and column.</p>
</table-wrap-foot>
</table-wrap>
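<p>The Fornell-Larcker comparison described above can be reproduced directly from the values in <xref ref-type="table" rid="tab5">Table 5</xref>. The following is a minimal Python sketch and not the authors&#x2019; analysis script: the function name <monospace>fornell_larcker</monospace> and the data layout are illustrative assumptions, while the AVE values and latent correlations are transcribed from the table. HTMT is not recomputed here because it requires the item-level correlation matrix, which is not reported in this section.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

# AVE and latent factor correlations transcribed from Table 5.
# PM = Prompt Management, CE = Critical Evaluation,
# IS = Integrative Synthesis, ACP = Authorial Core Processing.
ave = {"PM": 0.68, "CE": 0.70, "IS": 0.67, "ACP": 0.63}
corr = {
    ("PM", "CE"): 0.67,
    ("PM", "IS"): 0.63,
    ("CE", "IS"): 0.73,
    ("PM", "ACP"): 0.52,
    ("CE", "ACP"): 0.58,
    ("IS", "ACP"): 0.61,
}


def fornell_larcker(ave, corr):
    """For each factor, return (sqrt(AVE) rounded to 2 decimals,
    largest absolute correlation with any other factor,
    whether sqrt(AVE) exceeds that correlation)."""
    result = {}
    for factor, value in ave.items():
        root = math.sqrt(value)
        largest = max(abs(r) for pair, r in corr.items() if factor in pair)
        result[factor] = (round(root, 2), largest, root > largest)
    return result


checks = fornell_larcker(ave, corr)
for factor, (root, largest, ok) in checks.items():
    print(f"{factor}: sqrt(AVE) = {root:.2f}, max r = {largest:.2f}, criterion met: {ok}")
```
]]></preformat>
<p>The criterion is checked per factor against its largest correlation, which is equivalent to comparing each diagonal entry with every off-diagonal entry in the same row and column of the table.</p>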
</sec>
<sec id="sec15">
<label>4.4</label>
<title>Measurement invariance</title>
<p>To ensure that the CL-AI-L2W functions equivalently across subgroups, we conducted multi-group CFA to test for measurement invariance across gender (male vs. female) and the two study samples (EFA sample vs. CFA sample). We tested a sequence of nested models for configural, metric (factor loadings), and scalar (intercepts) invariance. As shown in <xref ref-type="table" rid="tab6">Table 6</xref>, all models demonstrated excellent fit to the data for both the gender and the sample comparisons. Crucially, the change in the Comparative Fit Index (&#x0394;CFI) between nested models was consistently minimal. For gender, the &#x0394;CFI was 0.002 at both the metric and the scalar step. For the sample comparison, the &#x0394;CFI was 0.001 at the metric step and 0.002 at the scalar step. As all &#x0394;CFI values were well below the recommended cutoff of 0.01, scalar invariance was supported. This indicates that the scale&#x2019;s factor structure, item loadings, and item intercepts are equivalent across these groups, supporting the validity of comparing latent mean scores between genders and across the two samples in future research.</p>
<table-wrap position="float" id="tab6">
<label>Table 6</label>
<caption>
<p>Measurement invariance testing across gender and sample.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Model</th>
<th align="center" valign="top">&#x03C7;<sup>2</sup></th>
<th align="center" valign="top">df</th>
<th align="center" valign="top">CFI</th>
<th align="center" valign="top">TLI</th>
<th align="center" valign="top">RMSEA [90% CI]</th>
<th align="center" valign="top">&#x0394;&#x03C7;<sup>2</sup></th>
<th align="center" valign="top">&#x0394;df</th>
<th align="center" valign="top">&#x0394;CFI</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle" colspan="9">Invariance across gender (male vs. female)</td>
</tr>
<tr>
<td align="left" valign="middle">M1: Configural</td>
<td align="center" valign="middle">485.21</td>
<td align="center" valign="middle">258</td>
<td align="center" valign="middle">0.972</td>
<td align="center" valign="middle">0.966</td>
<td align="center" valign="middle">0.055 [0.048, 0.062]</td>
<td align="center" valign="middle">&#x2013;</td>
<td align="center" valign="middle">&#x2013;</td>
<td align="center" valign="middle">&#x2013;</td>
</tr>
<tr>
<td align="left" valign="middle">M2: Metric</td>
<td align="center" valign="middle">499.86</td>
<td align="center" valign="middle">272</td>
<td align="center" valign="middle">0.970</td>
<td align="center" valign="middle">0.965</td>
<td align="center" valign="middle">0.054 [0.047, 0.061]</td>
<td align="center" valign="middle">14.65</td>
<td align="center" valign="middle">14</td>
<td align="center" valign="middle">0.002</td>
</tr>
<tr>
<td align="left" valign="middle">M3: Scalar</td>
<td align="center" valign="middle">515.03</td>
<td align="center" valign="middle">286</td>
<td align="center" valign="middle">0.968</td>
<td align="center" valign="middle">0.964</td>
<td align="center" valign="middle">0.053 [0.046, 0.060]</td>
<td align="center" valign="middle">15.17</td>
<td align="center" valign="middle">14</td>
<td align="center" valign="middle">0.002</td>
</tr>
<tr>
<td align="left" valign="middle" colspan="9">Invariance across sample (EFA vs. CFA)</td>
</tr>
<tr>
<td align="left" valign="middle">M4: Configural</td>
<td align="center" valign="middle">510.19</td>
<td align="center" valign="middle">258</td>
<td align="center" valign="middle">0.975</td>
<td align="center" valign="middle">0.970</td>
<td align="center" valign="middle">0.045 [0.039, 0.051]</td>
<td align="center" valign="middle">&#x2013;</td>
<td align="center" valign="middle">&#x2013;</td>
<td align="center" valign="middle">&#x2013;</td>
</tr>
<tr>
<td align="left" valign="middle">M5: Metric</td>
<td align="center" valign="middle">523.88</td>
<td align="center" valign="middle">272</td>
<td align="center" valign="middle">0.974</td>
<td align="center" valign="middle">0.970</td>
<td align="center" valign="middle">0.044 [0.038, 0.050]</td>
<td align="center" valign="middle">13.69</td>
<td align="center" valign="middle">14</td>
<td align="center" valign="middle">0.001</td>
</tr>
<tr>
<td align="left" valign="middle">M6: Scalar</td>
<td align="center" valign="middle">540.25</td>
<td align="center" valign="middle">286</td>
<td align="center" valign="middle">0.972</td>
<td align="center" valign="middle">0.968</td>
<td align="center" valign="middle">0.044 [0.038, 0.050]</td>
<td align="center" valign="middle">16.37</td>
<td align="center" valign="middle">14</td>
<td align="center" valign="middle">0.002</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>N</italic> =&#x202F;546. CFI, Comparative Fit Index; TLI, Tucker-Lewis Index; RMSEA, Root Mean Square Error of Approximation; CI, Confidence Interval. &#x0394;CFI values &#x2264; 0.01 between nested models support invariance.</p>
</table-wrap-foot>
</table-wrap>
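<p>The nested-model comparisons in <xref ref-type="table" rid="tab6">Table 6</xref> can be recomputed from the reported fit statistics. The sketch below is illustrative rather than the authors&#x2019; procedure: the function names are hypothetical, and the chi-square difference test assumes ordinary maximum-likelihood chi-square values (with a robust estimator, a scaled difference test would be required instead). To remain self-contained, it uses the closed-form chi-square survival function that holds only for even degrees of freedom, which covers the &#x0394;df of 14 at every step here.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate.

    Uses the closed form valid only for even df:
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0  # i = 0 term
    for i in range(1, df // 2):
        term *= half / i    # (x/2)^i / i!
        total += term
    return math.exp(-half) * total


# (model, chi-square, df, CFI) transcribed from Table 6.
models = {
    "gender": [
        ("configural", 485.21, 258, 0.972),
        ("metric", 499.86, 272, 0.970),
        ("scalar", 515.03, 286, 0.968),
    ],
    "sample": [
        ("configural", 510.19, 258, 0.975),
        ("metric", 523.88, 272, 0.974),
        ("scalar", 540.25, 286, 0.972),
    ],
}


def compare_nested(steps, cfi_cutoff=0.01):
    """Compare each model with the less constrained model before it."""
    rows = []
    for (_, chi_a, df_a, cfi_a), (name, chi_b, df_b, cfi_b) in zip(steps, steps[1:]):
        d_chi = round(chi_b - chi_a, 2)
        d_df = df_b - df_a
        d_cfi = round(cfi_a - cfi_b, 3)
        p_value = chi2_sf_even_df(d_chi, d_df)
        rows.append((name, d_chi, d_df, d_cfi, p_value, d_cfi <= cfi_cutoff))
    return rows


for grouping, steps in models.items():
    for name, d_chi, d_df, d_cfi, p_value, ok in compare_nested(steps):
        print(f"{grouping} {name}: d_chi2 = {d_chi:.2f}, d_df = {d_df}, "
              f"d_CFI = {d_cfi:.3f}, p = {p_value:.3f}, invariance supported: {ok}")
```
]]></preformat>
<p>The &#x0394;CFI rule is the criterion the text relies on; the chi-square difference is shown only as a complementary check, since it is known to be sensitive to sample size.</p>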
</sec>
</sec>
<sec sec-type="discussion" id="sec16">
<label>5</label>
<title>Discussion</title>
<p>The primary purpose of this study was to develop and validate the first psychometrically sound instrument, the Cognitive Load Scale for AI-assisted L2 Writing (CL-AI-L2W), to measure the multifaceted cognitive demands faced by L2 learners in the new era of generative AI. The rigorous, multi-phase methodology yielded an 18-item, four-factor scale with excellent reliability and strong evidence of validity. The findings not only address the critical gap identified in the literature but also provide novel insights into the evolving cognitive architecture of L2 writing.</p>
<p>The first research question sought to identify the underlying dimensions of cognitive load in AI-assisted L2 writing. The Exploratory Factor Analysis, confirmed by the subsequent CFA, revealed a clear and theoretically coherent four-factor structure: (1) Prompt Management, (2) Critical Evaluation, (3) Integrative Synthesis, and (4) Authorial Core Processing. This structure provides the first quantitative evidence for the cognitive reconfiguration of the writing process previously described in qualitative research.</p>
<p>The emergence of Prompt Management, Critical Evaluation, and Integrative Synthesis as distinct and robust factors empirically validates the foundational qualitative work of <xref ref-type="bibr" rid="ref31">Liu et al. (2024)</xref>. While their study identified these new processes descriptively, the present study demonstrates that they represent quantifiable and separate sources of cognitive load. Notably, Critical Evaluation emerged as the dimension with the highest mean score (M&#x202F;=&#x202F;4.81), suggesting that the most mentally demanding task for L2 learners is not generating text, but acting as a critical gatekeeper for AI-generated content. This involves a heavy cognitive investment in assessing relevance, accuracy, and style, a finding that underscores the importance of developing students&#x2019; critical AI literacy (<xref ref-type="bibr" rid="ref33">Lund et al., 2023</xref>; <xref ref-type="bibr" rid="ref55">Walters and Wilder, 2023</xref>).</p>
<p>The Integrative Synthesis factor captures the cognitive effort required to blend AI output with one&#x2019;s own writing, a process that requires maintaining authorial voice and ensuring coherence (<xref ref-type="bibr" rid="ref11">Cotton et al., 2023</xref>). Together, these three AI-centric factors illustrate a fundamental shift: cognitive load is no longer solely an internal phenomenon related to memory retrieval and sentence generation but is now heavily situated in the interactive, dialogic space between the writer and the AI. From a working memory perspective, these three AI-centric factors appear to impose a heavy burden primarily on the central executive (<xref ref-type="bibr" rid="ref1">Baddeley and Hitch, 1974</xref>; <xref ref-type="bibr" rid="ref22">Kellogg, 1996</xref>). Critical Evaluation and Integrative Synthesis, in particular, require the central executive to perform several demanding functions simultaneously: constantly switching attention between the AI&#x2019;s output and one&#x2019;s own mental model of the text, inhibiting irrelevant or inaccurate AI suggestions, and continuously updating the writing plan. This constant monitoring and decision-making process is a hallmark of executive control and explains why these factors emerged as significant sources of cognitive load. Prompt Management also taxes the central executive, as it involves goal setting, planning a sequence of queries, and monitoring the effectiveness of the human-AI dialogue.</p>
<p>Perhaps the most revealing finding is the fourth factor, Authorial Core Processing. This dimension combines high-level planning (e.g., deciding on an argument, creating a structure) and core linguistic encoding (e.g., grammar, lexis), which are central to traditional writing models (<xref ref-type="bibr" rid="ref22">Kellogg, 1996</xref>; <xref ref-type="bibr" rid="ref29">Li and Wang, 2024</xref>). The fact that these traditional elements clustered into a single, distinct factor suggests that even with AI assistance, a fundamental core of authorial responsibility remains. However, the lower mean score for this factor (M&#x202F;=&#x202F;3.48) compared with the three AI-interaction factors (M&#x202F;=&#x202F;4.40 to 4.81) is noteworthy. It provides empirical support for the hypothesis that AI tools offload a substantial portion of the cognitive burden traditionally associated with planning and translating (<xref ref-type="bibr" rid="ref17">Flower and Hayes, 1981</xref>), thereby freeing up cognitive resources that are reallocated to the new, demanding tasks of prompting, evaluating, and integrating. Interpreted through the lens of working memory (WM), the lower cognitive load on Authorial Core Processing suggests that AI assistance may offload some of the demands typically placed on WM&#x2019;s subsidiary systems. For instance, AI&#x2019;s ability to quickly generate grammatically correct sentences and suggest vocabulary could reduce the burden on the phonological loop, which is heavily involved in linguistic encoding (<xref ref-type="bibr" rid="ref22">Kellogg, 1996</xref>). Similarly, AI&#x2019;s capacity to help structure an outline might lessen the strain on the visuospatial sketchpad during planning. This offloading appears to free up limited central executive resources for the novel and highly demanding tasks of managing the human-AI interaction. The pattern illustrates a critical reallocation of cognitive resources within the writer&#x2019;s working memory system: a shift from internal content generation to external tool management and evaluation.</p>
<p>The second research question concerned the reliability and validity of the new scale. The results provide compelling evidence that the 18-item CL-AI-L2W is a robust and trustworthy instrument. The excellent internal consistency of the overall scale (<italic>&#x03B1;</italic>&#x202F;=&#x202F;0.94) and its subscales (<italic>&#x03B1;</italic> ranging from 0.87 to 0.92; <xref ref-type="table" rid="tab5">Table 5</xref>) indicates that the items within each factor reliably measure a single underlying construct.</p>
<p>The confirmatory factor analysis strongly supported the four-factor model, with all goodness-of-fit indices meeting or exceeding stringent criteria (<xref ref-type="bibr" rid="ref9001">Hu and Bentler, 1999</xref>). This confirms that the structure identified in the EFA is not a statistical artifact of one sample but is a stable representation of the construct. The criterion-related validity analyses further strengthen the case for the scale&#x2019;s utility. The strong positive correlation with a global measure of mental effort (<xref ref-type="bibr" rid="ref39">Paas, 1992</xref>) provides convergent validity, showing that the CL-AI-L2W indeed measures cognitive load.</p>
<p>Crucially, the scale behaves as expected within the broader nomological network of writing psychology. The moderate positive correlation with writing anxiety (r&#x202F;=&#x202F;0.45) aligns with established research showing that cognitively demanding tasks can exacerbate anxiety (<xref ref-type="bibr" rid="ref8">Cheng, 2004</xref>; <xref ref-type="bibr" rid="ref57">Zabihi, 2018</xref>). Conversely, the moderate negative correlation with writing self-efficacy (r&#x202F;=&#x202F;&#x2212;0.51) supports the notion that learners who feel more confident in their abilities perceive the writing task as less mentally burdensome (<xref ref-type="bibr" rid="ref43">Pajares and Valiante, 2006</xref>). These relationships demonstrate that the CL-AI-L2W is not only measuring cognitive load in isolation but is also meaningfully connected to the key affective factors that mediate the L2 writing experience.</p>
<p>The findings of this study carry significant implications for both theoretical understanding and pedagogical application. From a theoretical perspective, the validated four-factor structure of the CL-AI-L2W invites a reconsideration of existing cognitive models of writing. While classical models such as those of <xref ref-type="bibr" rid="ref17">Flower and Hayes (1981)</xref> and <xref ref-type="bibr" rid="ref22">Kellogg (1996)</xref> continue to provide valuable insights into the core processes of authorship, they are increasingly inadequate in capturing the distributed, interactive nature of AI-assisted composition. The results of this study suggest the need to conceptualize AI-assisted writing as a hybrid cognitive ecosystem in which the cognitive load is dynamically distributed between the writer&#x2019;s internal cognitive resources and the external cognitive affordances provided by AI systems. Within this framework, the CL-AI-L2W serves as a diagnostic tool to empirically map how cognitive responsibilities are allocated and managed during AI-mediated writing.</p>
<p>Pedagogically, the implications of this model are both immediate and actionable. The CL-AI-L2W can function as an effective instrument for diagnosing specific areas of cognitive strain that students encounter in the process of AI-assisted composition. By administering the scale, educators can identify whether learners experience the greatest difficulty in formulating effective prompts, critically evaluating AI-generated content, integrating and synthesizing information, or managing core authorial processes. Such insights enable the design of targeted instructional interventions tailored to students&#x2019; specific cognitive challenges. For instance, elevated scores in the dimension of prompt management would indicate the need to strengthen students&#x2019; proficiency in formulating clear, effective queries and engaging in productive dialogue with AI systems. Similarly, high scores in critical evaluation underscore the urgency of cultivating students&#x2019; digital literacy skills, including the ability to assess the credibility, bias, and stylistic appropriateness of AI-generated outputs. Where integrative synthesis scores are elevated, instructional focus may need to be placed on helping students paraphrase, summarize, and integrate information while maintaining a coherent and authentic authorial voice.</p>
<p>At the same time, the relatively lower cognitive load associated with authorial core processing offers both promise and caution. On one hand, this pattern may reflect the supportive role that AI can play in helping students overcome linguistic barriers, allowing them to allocate more cognitive resources to higher-order thinking and organization. On the other hand, there is a risk that essential skills related to argument construction, critical reasoning, and language production may be underdeveloped or gradually atrophied due to overreliance on AI-generated content. Consequently, while AI tools can enhance certain dimensions of the writing process, educators must remain attentive to the need for preserving and fostering core writing competencies to ensure that learners do not become passive recipients of machine-generated text, but remain active, critical, and creative agents in the composition process.</p>
</sec>
<sec sec-type="conclusions" id="sec17">
<label>6</label>
<title>Conclusion</title>
<p>The advent of generative AI represents a paradigm shift in L2 writing, fundamentally altering the cognitive demands of the composition process. This study successfully developed and validated the first Cognitive Load Scale for AI-assisted L2 Writing (CL-AI-L2W), an 18-item, four-factor instrument that is both reliable and valid. The scale reveals that the cognitive load in this new environment is a hybrid construct, comprising the novel demands of Prompt Management, Critical Evaluation, and Integrative Synthesis, alongside the enduring demands of Authorial Core Processing. By providing the field with a robust tool to measure this complex construct, this study lays the groundwork for a new generation of research aimed at understanding, modeling, and ultimately improving L2 writing pedagogy in the age of artificial intelligence.</p>
<p>While this study makes a significant contribution, several limitations should be acknowledged. First, the participants were Chinese university students of intermediate-to-high proficiency. Future research should validate the CL-AI-L2W with learners from different L1 backgrounds, proficiency levels, and educational contexts to establish its broader generalizability. Second, the study focused on a single argumentative writing task using one AI system. The distribution of cognitive load is likely to vary across different genres and tasks (e.g., creative writing, summary writing). Finally, for convergent validity, we employed a single-item measure of overall mental effort. While widely used, the psychometric properties of single-item measures, such as their reliability, cannot be assessed with the same rigor as multi-item scales. Future studies could incorporate additional measures or employ a test&#x2013;retest design to further strengthen the validity evidence.</p>
<p>The development of the CL-AI-L2W opens up numerous avenues for future research. Researchers can now move beyond description to systematic, quantitative investigation. For instance, experimental studies could use the scale as an outcome measure to compare the effectiveness of different AI training interventions. Longitudinal studies could track how the cognitive load profile of learners changes as they gain more expertise in human-AI collaboration. Finally, future studies could correlate the CL-AI-L2W subscale scores with objective measures of the writing process (e.g., keystroke logging, revision patterns) and product (e.g., CAF measures) to build a more comprehensive model of AI-assisted L2 writing.</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="sec18">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec sec-type="author-contributions" id="sec19">
<title>Author contributions</title>
<p>GY: Writing &#x2013; original draft, Writing &#x2013; review &#x0026; editing. LF: Writing &#x2013; review &#x0026; editing, Writing &#x2013; original draft.</p>
</sec>
<sec sec-type="funding-information" id="sec20">
<title>Funding</title>
<p>The author(s) declare that no financial support was received for the research and/or publication of this article.</p>
</sec>
<sec sec-type="COI-statement" id="sec21">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="ai-statement" id="sec22">
<title>Generative AI statement</title>
<p>The authors declare that no Gen AI was used in the creation of this manuscript.</p>
<p>Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.</p>
</sec>
<sec sec-type="disclaimer" id="sec23">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec sec-type="supplementary-material" id="sec24">
<title>Supplementary material</title>
<p>The Supplementary material for this article can be found online at: <ext-link xlink:href="https://www.frontiersin.org/articles/10.3389/fpsyg.2025.1666974/full#supplementary-material" ext-link-type="uri">https://www.frontiersin.org/articles/10.3389/fpsyg.2025.1666974/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Supplementary_file_1.docx" id="SM1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="ref1"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Baddeley</surname> <given-names>A. D.</given-names></name> <name><surname>Hitch</surname> <given-names>G.</given-names></name></person-group> (<year>1974</year>). &#x201C;<article-title>Working memory</article-title>&#x201D; in <source>The psychology of learning and motivation: advances in research and theory</source>. ed. <person-group person-group-type="editor"><name><surname>Bower</surname> <given-names>G. H.</given-names></name></person-group>, vol. <volume>8</volume> (<publisher-name>Academic Press</publisher-name>), <fpage>47</fpage>&#x2013;<lpage>89</lpage>.</citation></ref>
<ref id="ref2"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Bandura</surname> <given-names>A.</given-names></name></person-group> (<year>1997</year>). <source>Self-efficacy: the exercise of control</source>. <publisher-name>Worth</publisher-name>.</citation></ref>
<ref id="ref3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baoshu</surname> <given-names>Y.</given-names></name> <name><surname>Chuanbi</surname> <given-names>N.</given-names></name></person-group> (<year>2015</year>). <article-title>Planning and working memory effects on L2 performance in Chinese EFL learners' argumentative writing</article-title>. <source>Indones. J. Appl. Linguist.</source> <volume>5</volume>, <fpage>44</fpage>&#x2013;<lpage>53</lpage>. doi: <pub-id pub-id-type="doi">10.17509/ijal.v5i1.830</pub-id></citation></ref>
<ref id="ref4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baoshu</surname> <given-names>Y.</given-names></name> <name><surname>Luo</surname> <given-names>S.</given-names></name></person-group> (<year>2012</year>). <article-title>The effect of working memory capacity on written language production of second language learners</article-title>. <source>Foreign Lang. Teach. Res.</source> <volume>44</volume>, <fpage>536</fpage>&#x2013;<lpage>546</lpage>.</citation></ref>
<ref id="ref5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barrot</surname> <given-names>J. S.</given-names></name></person-group> (<year>2023</year>). <article-title>Using ChatGPT for second language writing: pitfalls and potentials</article-title>. <source>Assess. Writing</source> <volume>57</volume>:<fpage>100745</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.asw.2023.100745</pub-id></citation></ref>
<ref id="ref6"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Bereiter</surname> <given-names>C.</given-names></name></person-group> (<year>1980</year>). &#x201C;<article-title>Development in writing</article-title>&#x201D; in <source>Cognitive processes in writing</source>. eds. <person-group person-group-type="editor"><name><surname>Gregg</surname> <given-names>L.</given-names></name> <name><surname>Steinberg</surname> <given-names>E.</given-names></name></person-group> (<publisher-loc>New Jersey, USA</publisher-loc>: <publisher-name>Lawrence Erlbaum</publisher-name>), <fpage>73</fpage>&#x2013;<lpage>93</lpage>.</citation></ref>
<ref id="ref7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bergsleithner</surname> <given-names>J. M.</given-names></name></person-group> (<year>2010</year>). <article-title>Working memory capacity and L2 writing performance</article-title>. <source>Ci&#x00EA;ncias Cogni&#x00E7;&#x00E3;o</source> <volume>15</volume>, <fpage>2</fpage>&#x2013;<lpage>20</lpage>.</citation></ref>
<ref id="ref8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cheng</surname> <given-names>Y.-S.</given-names></name></person-group> (<year>2004</year>). <article-title>A measure of second language writing anxiety: scale development and preliminary validation</article-title>. <source>J. Second. Lang. Writ.</source> <volume>13</volume>, <fpage>313</fpage>&#x2013;<lpage>335</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jslw.2004.07.001</pub-id></citation></ref>
<ref id="ref9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cheng</surname> <given-names>Y. S.</given-names></name> <name><surname>Horwitz</surname> <given-names>E. K.</given-names></name> <name><surname>Schallert</surname> <given-names>D.</given-names></name></person-group> (<year>1999</year>). <article-title>Language anxiety: differentiating writing and speaking components</article-title>. <source>Lang. Learn.</source> <volume>49</volume>, <fpage>417</fpage>&#x2013;<lpage>446</lpage>. doi: <pub-id pub-id-type="doi">10.1111/0023-8333.00095</pub-id></citation></ref>
<ref id="ref10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Choi</surname> <given-names>S.</given-names></name></person-group> (<year>2014</year>). <article-title>Language anxiety in second language learning: is it really a stumbling block?</article-title> <source>Sec. Lang. Stud.</source> <volume>31</volume>, <fpage>1</fpage>&#x2013;<lpage>42</lpage>.</citation></ref>
<ref id="ref11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cotton</surname> <given-names>D. R. E.</given-names></name> <name><surname>Cotton</surname> <given-names>P. A.</given-names></name> <name><surname>Shipway</surname> <given-names>J. R.</given-names></name></person-group> (<year>2023</year>). <article-title>Chatting and cheating: ensuring academic integrity in the era of ChatGPT</article-title>. <source>Innov. Educ. Teach. Int.</source> <volume>61</volume>, <fpage>228</fpage>&#x2013;<lpage>239</lpage>. doi: <pub-id pub-id-type="doi">10.1080/14703297.2023.2190148</pub-id></citation></ref>
<ref id="ref12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Daly</surname> <given-names>J. A.</given-names></name> <name><surname>Miller</surname> <given-names>M. D.</given-names></name></person-group> (<year>1975</year>). <article-title>The empirical development of an instrument to measure writing apprehension</article-title>. <source>Res. Teach. Engl.</source> <volume>9</volume>, <fpage>242</fpage>&#x2013;<lpage>249</lpage>. doi: <pub-id pub-id-type="doi">10.58680/rte197520067</pub-id></citation></ref>
<ref id="ref13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Deng</surname> <given-names>Y.</given-names></name> <name><surname>Liu</surname> <given-names>D.</given-names></name> <name><surname>Feng</surname> <given-names>D. (William)</given-names></name></person-group> (<year>2023</year>). <article-title>Students' perceptions of peer review for assessing digital multimodal composing: the case of a discipline-specific English course</article-title>. <source>Assess. Eval. High. Educ.</source> <volume>48</volume>, <fpage>1254</fpage>&#x2013;<lpage>1267</lpage>. doi: <pub-id pub-id-type="doi">10.1080/02602938.2023.2227358</pub-id></citation></ref>
<ref id="ref14"><citation citation-type="book"><person-group person-group-type="author"><name><surname>DeVellis</surname> <given-names>R. F.</given-names></name></person-group> (<year>2017</year>). <source>Scale development: Theory and applications</source>. <edition>4th</edition> Edn. <publisher-name>Sage</publisher-name>.</citation></ref>
<ref id="ref15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Faigley</surname> <given-names>L.</given-names></name> <name><surname>Daly</surname> <given-names>J. A.</given-names></name> <name><surname>Witte</surname> <given-names>S. P.</given-names></name></person-group> (<year>1981</year>). <article-title>The role of writing apprehension in writing performance and competence</article-title>. <source>J. Educ. Res.</source> <volume>75</volume>, <fpage>16</fpage>&#x2013;<lpage>21</lpage>. doi: <pub-id pub-id-type="doi">10.1080/00220671.1981.10885348</pub-id></citation></ref>
<ref id="ref16"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Flower</surname> <given-names>L.</given-names></name> <name><surname>Hayes</surname> <given-names>J. R.</given-names></name></person-group> (<year>1980</year>). &#x201C;<article-title>The dynamics of composing: making plans and juggling constraints</article-title>&#x201D; in <source>Cognitive processes in writing</source>. eds. <person-group person-group-type="editor"><name><surname>Gregg</surname> <given-names>L.</given-names></name> <name><surname>Steinberg</surname> <given-names>E.</given-names></name></person-group> (<publisher-loc>New Jersey, USA</publisher-loc>: <publisher-name>Lawrence Erlbaum</publisher-name>), <fpage>31</fpage>&#x2013;<lpage>50</lpage>.</citation></ref>
<ref id="ref17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Flower</surname> <given-names>L.</given-names></name> <name><surname>Hayes</surname> <given-names>J. R.</given-names></name></person-group> (<year>1981</year>). <article-title>A cognitive process theory of writing</article-title>. <source>Coll. Compos. Commun.</source> <volume>32</volume>, <fpage>365</fpage>&#x2013;<lpage>387</lpage>. doi: <pub-id pub-id-type="doi">10.58680/ccc198115885</pub-id></citation></ref>
<ref id="ref18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Granena</surname> <given-names>G.</given-names></name></person-group> (<year>2023</year>). <article-title>Cognitive individual differences in the process and product of L2 writing</article-title>. <source>Stud. Second. Lang. Acquis.</source> <volume>45</volume>, <fpage>765</fpage>&#x2013;<lpage>785</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S0272263123000347</pub-id></citation></ref>
<ref id="ref19"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Hayes</surname> <given-names>J. R.</given-names></name></person-group> (<year>1996</year>). &#x201C;<article-title>A new framework for understanding cognition and affect in writing</article-title>&#x201D; in <source>The science of writing: Theories, methods, individual differences and applications</source>. eds. <person-group person-group-type="editor"><name><surname>Levy</surname> <given-names>C. M.</given-names></name> <name><surname>Ransdell</surname> <given-names>S.</given-names></name></person-group> (<publisher-loc>New Jersey, USA</publisher-loc>: <publisher-name>Lawrence Erlbaum Associates</publisher-name>), <fpage>1</fpage>&#x2013;<lpage>27</lpage>.</citation></ref>
<ref id="ref20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hayes</surname> <given-names>J. R.</given-names></name></person-group> (<year>2012</year>). <article-title>Modeling and remodeling writing</article-title>. <source>Written Commun.</source> <volume>29</volume>, <fpage>369</fpage>&#x2013;<lpage>388</lpage>. doi: <pub-id pub-id-type="doi">10.1177/0741088312451260</pub-id></citation></ref>
<ref id="ref9001"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>L.-T.</given-names></name> <name><surname>Bentler</surname> <given-names>P. M.</given-names></name></person-group> (<year>1999</year>). <article-title>Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives</article-title>. <source>Struct. Equ. Model.</source> <volume>6</volume>, <fpage>1</fpage>&#x2013;<lpage>55</lpage>. doi: <pub-id pub-id-type="doi">10.1080/10705519909540118</pub-id></citation></ref>
<ref id="ref21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname> <given-names>D.</given-names></name> <name><surname>Kalyuga</surname> <given-names>S.</given-names></name></person-group> (<year>2022</year>). <article-title>Learning English as a foreign language writing skills in collaborative settings: a cognitive load perspective</article-title>. <source>Front. Psychol.</source> <volume>13</volume>:<fpage>932291</fpage>. doi: <pub-id pub-id-type="doi">10.3389/fpsyg.2022.932291</pub-id>, PMID: <pub-id pub-id-type="pmid">35846619</pub-id></citation></ref>
<ref id="ref22"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Kellogg</surname> <given-names>R. T.</given-names></name></person-group> (<year>1996</year>). &#x201C;<article-title>A model of working memory in writing</article-title>&#x201D; in <source>The science of writing: Theories, methods, individual differences and applications</source>. eds. <person-group person-group-type="editor"><name><surname>Levy</surname> <given-names>C. M.</given-names></name> <name><surname>Ransdell</surname> <given-names>S. E.</given-names></name></person-group> (<publisher-loc>New Jersey, USA</publisher-loc>: <publisher-name>Lawrence Erlbaum Associates</publisher-name>), <fpage>57</fpage>&#x2013;<lpage>71</lpage>.</citation></ref>
<ref id="ref23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kellogg</surname> <given-names>R. T.</given-names></name></person-group> (<year>2001</year>). <article-title>Long-term working memory in text production</article-title>. <source>Mem. Cogn.</source> <volume>29</volume>, <fpage>43</fpage>&#x2013;<lpage>52</lpage>. doi: <pub-id pub-id-type="doi">10.3758/BF03195739</pub-id>, PMID: <pub-id pub-id-type="pmid">11277463</pub-id></citation></ref>
<ref id="ref24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Klassen</surname> <given-names>R.</given-names></name></person-group> (<year>2003</year>). <article-title>Writing in early adolescence: a review of the role of self-efficacy beliefs</article-title>. <source>Educ. Psychol. Rev.</source> <volume>14</volume>, <fpage>173</fpage>&#x2013;<lpage>203</lpage>. doi: <pub-id pub-id-type="doi">10.1023/A:1014626805572</pub-id></citation></ref>
<ref id="ref25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kormos</surname> <given-names>J.</given-names></name></person-group> (<year>2012</year>). <article-title>The role of individual differences in L2 writing</article-title>. <source>J. Second. Lang. Writ.</source> <volume>21</volume>, <fpage>390</fpage>&#x2013;<lpage>403</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jslw.2012.09.003</pub-id></citation></ref>
<ref id="ref26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kormos</surname> <given-names>J.</given-names></name></person-group> (<year>2023</year>). <article-title>The role of cognitive factors in second language writing and writing to learn a second language</article-title>. <source>Stud. Second. Lang. Acquis.</source> <volume>45</volume>, <fpage>622</fpage>&#x2013;<lpage>646</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S0272263122000481</pub-id></citation></ref>
<ref id="ref27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>S. Y.</given-names></name></person-group> (<year>2005</year>). <article-title>Facilitating and inhibiting factors on EFL writing: a model testing with SEM</article-title>. <source>Lang. Learn.</source> <volume>55</volume>, <fpage>335</fpage>&#x2013;<lpage>374</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.0023-8333.2005.00306.x</pub-id></citation></ref>
<ref id="ref28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>C. C.</given-names></name> <name><surname>Tan</surname> <given-names>S. C.</given-names></name></person-group> (<year>2010</year>). <article-title>Scaffolding writing using feedback in students&#x2019; graphic organizers&#x2013;novice writers&#x2019; relevance of ideas and cognitive loads</article-title>. <source>Educ. Media Int.</source> <volume>47</volume>, <fpage>135</fpage>&#x2013;<lpage>152</lpage>. doi: <pub-id pub-id-type="doi">10.1080/09523987.2010.492678</pub-id></citation></ref>
<ref id="ref29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name></person-group> (<year>2024</year>). <article-title>A measure of EFL argumentative writing cognitive load: scale development and validation</article-title>. <source>J. Second. Lang. Writ.</source> <volume>63</volume>:<fpage>101095</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jslw.2024.101095</pub-id></citation></ref>
<ref id="ref30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>C.</given-names></name> <name><surname>Wei</surname> <given-names>L.</given-names></name> <name><surname>Lu</surname> <given-names>X.</given-names></name></person-group> (<year>2024</year>). <article-title>Task complexity and L2 writing performance of young learners: contributions of cognitive and affective factors</article-title>. <source>Mod. Lang. J.</source> <volume>108</volume>, <fpage>741</fpage>&#x2013;<lpage>770</lpage>. doi: <pub-id pub-id-type="doi">10.1111/modl.12954</pub-id></citation></ref>
<ref id="ref31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>M.</given-names></name> <name><surname>Zhang</surname> <given-names>L. J.</given-names></name> <name><surname>Biebricher</surname> <given-names>C.</given-names></name></person-group> (<year>2024</year>). <article-title>Investigating students&#x2019; cognitive processes in generative AI-assisted digital multimodal composing and traditional writing</article-title>. <source>Comput. Educ.</source> <volume>211</volume>:<fpage>104977</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.compedu.2023.104977</pub-id></citation></ref>
<ref id="ref32"><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Lucy</surname> <given-names>L.</given-names></name> <name><surname>Bamman</surname> <given-names>D.</given-names></name></person-group> (<year>2021</year>). <article-title>Gender and representation bias in GPT-3 generated stories</article-title>. In <conf-name>Proceedings of the third workshop on narrative understanding</conf-name> (pp. <fpage>48</fpage>&#x2013;<lpage>55</lpage>). <publisher-name>Association for Computational Linguistics</publisher-name>.</citation></ref>
<ref id="ref33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lund</surname> <given-names>B. D.</given-names></name> <name><surname>Wang</surname> <given-names>T.</given-names></name> <name><surname>Mannuru</surname> <given-names>N. R.</given-names></name> <name><surname>Nie</surname> <given-names>B.</given-names></name> <name><surname>Shimray</surname> <given-names>S.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name></person-group> (<year>2023</year>). <article-title>ChatGPT and a new academic reality: artificial intelligence-written research papers and the ethics of the large language models in scholarly publishing</article-title>. <source>J. Assoc. Inf. Sci. Technol.</source> <volume>74</volume>, <fpage>570</fpage>&#x2013;<lpage>581</lpage>. doi: <pub-id pub-id-type="doi">10.1002/asi.24750</pub-id></citation></ref>
<ref id="ref34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Manch&#x00F3;n</surname> <given-names>R. M.</given-names></name> <name><surname>McBride</surname> <given-names>S.</given-names></name> <name><surname>Mellado</surname> <given-names>M. D.</given-names></name> <name><surname>Vasylets</surname> <given-names>O.</given-names></name></person-group> (<year>2023</year>). <article-title>Working memory, L2 proficiency, and task complexity: independent and interactive effects on L2 written performance</article-title>. <source>Stud. Second. Lang. Acquis.</source> <volume>45</volume>, <fpage>737</fpage>&#x2013;<lpage>764</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S0272263123000141</pub-id></citation></ref>
<ref id="ref35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McCutchen</surname> <given-names>D.</given-names></name></person-group> (<year>2000</year>). <article-title>Knowledge, processing, and working memory: implications for a theory of writing</article-title>. <source>Educ. Psychol.</source> <volume>35</volume>, <fpage>13</fpage>&#x2013;<lpage>23</lpage>. doi: <pub-id pub-id-type="doi">10.1207/S15326985EP3501_3</pub-id></citation></ref>
<ref id="ref36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McLeod</surname> <given-names>S.</given-names></name></person-group> (<year>1987</year>). <article-title>Some thoughts about feelings: the affective domain and the writing process</article-title>. <source>Coll. Compos. Commun.</source> <volume>38</volume>, <fpage>426</fpage>&#x2013;<lpage>435</lpage>.</citation></ref>
<ref id="ref37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>N&#x00FC;ckles</surname> <given-names>M.</given-names></name> <name><surname>Roelle</surname> <given-names>J.</given-names></name> <name><surname>Glogger-Frey</surname> <given-names>I.</given-names></name> <name><surname>Waldeyer</surname> <given-names>J.</given-names></name> <name><surname>Renkl</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>The self-regulation-view in writing-to-learn: using journal writing to optimize cognitive load in self-regulated learning</article-title>. <source>Educ. Psychol. Rev.</source> <volume>32</volume>, <fpage>1089</fpage>&#x2013;<lpage>1126</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s10648-020-09541-1</pub-id></citation></ref>
<ref id="ref38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Olive</surname> <given-names>T.</given-names></name></person-group> (<year>2004</year>). <article-title>Working memory in writing: empirical evidence from the dual-task technique</article-title>. <source>Eur. Psychol.</source> <volume>9</volume>, <fpage>32</fpage>&#x2013;<lpage>42</lpage>. doi: <pub-id pub-id-type="doi">10.1027/1016-9040.9.1.32</pub-id></citation></ref>
<ref id="ref39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Paas</surname> <given-names>F.</given-names></name></person-group> (<year>1992</year>). <article-title>Training strategies for attaining transfer of problem-solving skill in statistics: a cognitive load approach</article-title>. <source>J. Educ. Psychol.</source> <volume>84</volume>, <fpage>429</fpage>&#x2013;<lpage>434</lpage>. doi: <pub-id pub-id-type="doi">10.1037/0022-0663.84.4.429</pub-id></citation></ref>
<ref id="ref40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Paas</surname> <given-names>F.</given-names></name> <name><surname>Renkl</surname> <given-names>A.</given-names></name> <name><surname>Sweller</surname> <given-names>J.</given-names></name></person-group> (<year>2003</year>). <article-title>Cognitive load theory and instructional design: recent developments</article-title>. <source>Educ. Psychol.</source> <volume>38</volume>, <fpage>1</fpage>&#x2013;<lpage>4</lpage>. doi: <pub-id pub-id-type="doi">10.1207/S15326985EP3801_1</pub-id></citation></ref>
<ref id="ref41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pajares</surname> <given-names>F.</given-names></name></person-group> (<year>2003</year>). <article-title>Self-efficacy beliefs, motivation, and achievement in writing: a review of the literature</article-title>. <source>Read. Writ. Q.</source> <volume>19</volume>, <fpage>139</fpage>&#x2013;<lpage>158</lpage>. doi: <pub-id pub-id-type="doi">10.1080/10573560308222</pub-id></citation></ref>
<ref id="ref42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pajares</surname> <given-names>F.</given-names></name> <name><surname>Johnson</surname> <given-names>M. J.</given-names></name></person-group> (<year>1994</year>). <article-title>Confidence and competence in writing: the role of writing self-efficacy, outcome expectancy and apprehension</article-title>. <source>Res. Teach. Engl.</source> <volume>28</volume>, <fpage>313</fpage>&#x2013;<lpage>331</lpage>.</citation></ref>
<ref id="ref43"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Pajares</surname> <given-names>F.</given-names></name> <name><surname>Valiante</surname> <given-names>G.</given-names></name></person-group> (<year>2006</year>). &#x201C;<article-title>Self-efficacy beliefs and motivation in writing development</article-title>&#x201D; in <source>Handbook of writing research</source>. eds. <person-group person-group-type="editor"><name><surname>MacArthur</surname> <given-names>C. A.</given-names></name> <name><surname>Graham</surname> <given-names>S.</given-names></name> <name><surname>Fitzgerald</surname> <given-names>J.</given-names></name></person-group> (<publisher-loc>New York, USA</publisher-loc>: <publisher-name>Guilford Press</publisher-name>), <fpage>158</fpage>&#x2013;<lpage>170</lpage>.</citation></ref>
<ref id="ref44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Prat-Sala</surname> <given-names>M.</given-names></name> <name><surname>Redford</surname> <given-names>P.</given-names></name></person-group> (<year>2012</year>). <article-title>Writing essays: does self-efficacy matter? The relationship between self-efficacy in reading and in writing and undergraduate students&#x2019; performance in essay writing</article-title>. <source>Educ. Psychol.</source> <volume>32</volume>, <fpage>9</fpage>&#x2013;<lpage>20</lpage>. doi: <pub-id pub-id-type="doi">10.1080/01443410.2011.621411</pub-id></citation></ref>
<ref id="ref45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>R&#x00E9;v&#x00E9;sz</surname> <given-names>A.</given-names></name> <name><surname>Michel</surname> <given-names>M.</given-names></name> <name><surname>Gilabert</surname> <given-names>R.</given-names></name></person-group> (<year>2016</year>). <article-title>Measuring cognitive task demands using dual-task methodology, subjective self-ratings, and expert judgments: a validation study</article-title>. <source>Stud. Second. Lang. Acquis.</source> <volume>38</volume>, <fpage>703</fpage>&#x2013;<lpage>737</lpage>. doi: <pub-id pub-id-type="doi">10.1017/S0272263115000339</pub-id></citation></ref>
<ref id="ref46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Robinson</surname> <given-names>P.</given-names></name></person-group> (<year>2001</year>). <article-title>Task complexity, task difficulty, and task production: exploring interactions in a componential framework</article-title>. <source>Appl. Linguist.</source> <volume>22</volume>, <fpage>27</fpage>&#x2013;<lpage>57</lpage>. doi: <pub-id pub-id-type="doi">10.1093/applin/22.1.27</pub-id></citation></ref>
<ref id="ref47"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Scardamalia</surname> <given-names>M.</given-names></name></person-group> (<year>1981</year>). &#x201C;<article-title>How children cope with the cognitive demands of writing</article-title>&#x201D; in <source>Writing: The nature, development, and teaching of written communication</source>. eds. <person-group person-group-type="editor"><name><surname>Frederiksen</surname> <given-names>C. H.</given-names></name> <name><surname>Dominic</surname> <given-names>J. F.</given-names></name></person-group> (<publisher-loc>New Jersey, USA</publisher-loc>: <publisher-name>Lawrence Erlbaum</publisher-name>), <fpage>81</fpage>&#x2013;<lpage>103</lpage>.</citation></ref>
<ref id="ref48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schunk</surname> <given-names>D. H.</given-names></name></person-group> (<year>2003</year>). <article-title>Self-efficacy for reading and writing: influence of modeling, goal setting, and self-evaluation</article-title>. <source>Read. Writ. Q.</source> <volume>19</volume>, <fpage>159</fpage>&#x2013;<lpage>172</lpage>. doi: <pub-id pub-id-type="doi">10.1080/10573560308219</pub-id></citation></ref>
<ref id="ref49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Su</surname> <given-names>Y.</given-names></name> <name><surname>Lin</surname> <given-names>Y.</given-names></name> <name><surname>Lai</surname> <given-names>C.</given-names></name></person-group> (<year>2023</year>). <article-title>Collaborating with ChatGPT in argumentative writing classrooms</article-title>. <source>Assess. Writing</source> <volume>57</volume>:<fpage>100752</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.asw.2023.100752</pub-id></citation></ref>
<ref id="ref50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sweller</surname> <given-names>J.</given-names></name></person-group> (<year>1988</year>). <article-title>Cognitive load during problem solving: effects on learning</article-title>. <source>Cogn. Sci.</source> <volume>12</volume>, <fpage>257</fpage>&#x2013;<lpage>285</lpage>. doi: <pub-id pub-id-type="doi">10.1207/s15516709cog1202_4</pub-id></citation></ref>
<ref id="ref51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sweller</surname> <given-names>J.</given-names></name></person-group> (<year>2010</year>). <article-title>Element interactivity and intrinsic, extraneous, and germane cognitive load</article-title>. <source>Educ. Psychol. Rev.</source> <volume>22</volume>, <fpage>123</fpage>&#x2013;<lpage>138</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s10648-010-9128-5</pub-id></citation></ref>
<ref id="ref52"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Sweller</surname> <given-names>J.</given-names></name></person-group> (<year>2011</year>). &#x201C;<article-title>Cognitive load theory</article-title>&#x201D; in <source>Psychology of learning and motivation</source>, vol. <volume>55</volume> (<publisher-loc>New York, USA</publisher-loc>: <publisher-name>Academic Press</publisher-name>), <fpage>37</fpage>&#x2013;<lpage>76</lpage>.</citation></ref>
<ref id="ref53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sweller</surname> <given-names>J.</given-names></name> <name><surname>Van Merri&#x00EB;nboer</surname> <given-names>J. J.</given-names></name> <name><surname>Paas</surname> <given-names>F. G.</given-names></name></person-group> (<year>1998</year>). <article-title>Cognitive architecture and instructional design</article-title>. <source>Educ. Psychol. Rev.</source> <volume>10</volume>, <fpage>251</fpage>&#x2013;<lpage>296</lpage>. doi: <pub-id pub-id-type="doi">10.1023/A:1022193728205</pub-id></citation></ref>
<ref id="ref54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sweller</surname> <given-names>J.</given-names></name> <name><surname>Van Merri&#x00EB;nboer</surname> <given-names>J. J.</given-names></name> <name><surname>Paas</surname> <given-names>F.</given-names></name></person-group> (<year>2019</year>). <article-title>Cognitive architecture and instructional design: 20 years later</article-title>. <source>Educ. Psychol. Rev.</source> <volume>31</volume>, <fpage>261</fpage>&#x2013;<lpage>292</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s10648-019-09465-5</pub-id></citation></ref>
<ref id="ref55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Walters</surname> <given-names>W. H.</given-names></name> <name><surname>Wilder</surname> <given-names>E. I.</given-names></name></person-group> (<year>2023</year>). <article-title>Fabrication and errors in the bibliographic citations generated by ChatGPT</article-title>. <source>Sci. Rep.</source> <volume>13</volume>:<fpage>14045</fpage>. doi: <pub-id pub-id-type="doi">10.1038/s41598-023-41032-5</pub-id>, PMID: <pub-id pub-id-type="pmid">37679503</pub-id></citation></ref>
<ref id="ref56"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Weigle</surname> <given-names>S. C.</given-names></name></person-group> (<year>2005</year>). &#x201C;<article-title>Second language writing expertise</article-title>&#x201D; in <source>Expertise in second language learning and teaching</source>. ed. <person-group person-group-type="editor"><name><surname>Johnson</surname> <given-names>K.</given-names></name></person-group> (<publisher-loc>London, UK</publisher-loc>: <publisher-name>Palgrave Macmillan</publisher-name>), <fpage>128</fpage>&#x2013;<lpage>149</lpage>.</citation></ref>
<ref id="ref9003"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Yu</surname> <given-names>C. Y.</given-names></name></person-group> (<year>2002</year>). <article-title>Evaluating cutoff criteria of model fit indices for latent variable models with binary and continuous outcomes</article-title>. <source>Doctoral dissertation</source>. <publisher-loc>Los Angeles</publisher-loc>: <publisher-name>University of California</publisher-name>.</citation></ref>
<ref id="ref57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zabihi</surname> <given-names>R.</given-names></name></person-group> (<year>2018</year>). <article-title>The role of cognitive and affective factors in measures of L2 writing</article-title>. <source>Written Commun.</source> <volume>35</volume>, <fpage>32</fpage>&#x2013;<lpage>57</lpage>. doi: <pub-id pub-id-type="doi">10.1177/0741088317735836</pub-id></citation></ref>
<ref id="ref58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>X.</given-names></name></person-group> (<year>2022</year>). <article-title>Leveraging artificial intelligence (AI) technology for English writing: introducing Wordtune as a digital writing assistant for EFL writers</article-title>. <source>RELC J.</source> <volume>54</volume>, <fpage>856</fpage>&#x2013;<lpage>862</lpage>. doi: <pub-id pub-id-type="doi">10.1177/00336882221094089</pub-id></citation></ref>
<ref id="ref59"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zimmermann</surname> <given-names>R.</given-names></name></person-group> (<year>2000</year>). <article-title>L2 writing: subprocesses, a model of formulating and empirical findings</article-title>. <source>Learn. Instr.</source> <volume>10</volume>, <fpage>73</fpage>&#x2013;<lpage>99</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0959-4752(99)00019-5</pub-id></citation></ref>
</ref-list>
</back>
</article>