<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3-mathml3.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ali="http://www.niso.org/schemas/ali/1.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" dtd-version="1.3" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Educ.</journal-id>
<journal-title-group>
<journal-title>Frontiers in Education</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Educ.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2504-284X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/feduc.2025.1620029</article-id>
<article-version article-version-type="Version of Record" vocab="NISO-RP-8-2008"/>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Original Research</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Advancing textbook evaluation with debiased machine learning: a theoretical and empirical approach</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Bian</surname> <given-names>Yong</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x02020;</sup></xref>
<role vocab="credit" vocab-identifier="https://credit.niso.org/" vocab-term="Writing &#x2013; review &amp; editing" vocab-term-identifier="https://credit.niso.org/contributor-roles/writing-review-editing/">Writing &#x2013; review &#x00026; editing</role>
<role vocab="credit" vocab-identifier="https://credit.niso.org/" vocab-term="Writing &#x2013; original draft" vocab-term-identifier="https://credit.niso.org/contributor-roles/writing-original-draft/">Writing &#x2013; original draft</role>
<uri xlink:href="https://loop.frontiersin.org/people/1922414"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Fang</surname> <given-names>Zhou</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<role vocab="credit" vocab-identifier="https://credit.niso.org/" vocab-term="Writing &#x2013; original draft" vocab-term-identifier="https://credit.niso.org/contributor-roles/writing-original-draft/">Writing &#x2013; original draft</role>
<role vocab="credit" vocab-identifier="https://credit.niso.org/" vocab-term="Writing &#x2013; review &amp; editing" vocab-term-identifier="https://credit.niso.org/contributor-roles/writing-review-editing/">Writing &#x2013; review &#x00026; editing</role>
<uri xlink:href="https://loop.frontiersin.org/people/1256222"/>
</contrib>
</contrib-group>
<aff id="aff1"><label>1</label><institution>Department of Family Medicine and Public Health Sciences, Wayne State University</institution>, <city>Detroit, MI</city>, <country country="us">United States</country></aff>
<aff id="aff2"><label>2</label><institution>Center for Molecular Medicine and Genetics (CMMG), Wayne State University</institution>, <city>Detroit, MI</city>, <country country="us">United States</country></aff>
<aff id="aff3"><label>3</label><institution>Department of Psychological &#x00026; Brain Sciences, Texas A&#x00026;M University</institution>, <city>College Station, TX</city>, <country country="us">United States</country></aff>
<author-notes>
<corresp id="c001"><label>&#x0002A;</label>Correspondence: Yong Bian, <email xlink:href="mailto:oliviabiany@hotmail.com">oliviabiany@hotmail.com</email></corresp>
<fn fn-type="other" id="fn001"><label>&#x02020;</label><p>ORCID: Yong Bian <uri xlink:href="https://orcid.org/0000-0002-0557-3145">orcid.org/0000-0002-0557-3145</uri></p></fn>
</author-notes>
<pub-date publication-format="electronic" date-type="pub" iso-8601-date="2026-01-12">
<day>12</day>
<month>01</month>
<year>2026</year>
</pub-date>
<pub-date publication-format="electronic" date-type="collection">
<year>2025</year>
</pub-date>
<volume>10</volume>
<elocation-id>1620029</elocation-id>
<history>
<date date-type="received">
<day>14</day>
<month>05</month>
<year>2025</year>
</date>
<date date-type="rev-recd">
<day>08</day>
<month>12</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>08</day>
<month>12</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2026 Bian and Fang.</copyright-statement>
<copyright-year>2026</copyright-year>
<copyright-holder>Bian and Fang</copyright-holder>
<license>
<ali:license_ref start_date="2026-01-12">https://creativecommons.org/licenses/by/4.0/</ali:license_ref>
<license-p>This is an open-access article distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License (CC BY)</ext-link>. The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</license-p>
</license>
</permissions>
<abstract>
<sec>
<title>Introduction</title>
<p>Textbooks can substantially influence student achievement, but common evaluation approaches (e.g., linear regression) often depend on strong functional-form assumptions that may misstate causal effects. This study presents Double/Debiased Machine Learning (DML) as a more flexible framework for estimating the causal impact of textbooks on learning outcomes.</p></sec>
<sec>
<title>Methods</title>
<p>We use DML to estimate textbook effects while allowing high-dimensional, non-parametric modeling of outcome and treatment assignment processes. We (1) derive the theoretical advantages of DML-particularly its robustness to model misspecification and its use of orthogonalized estimating equations-and (2) apply the approach to an existing large-scale elementary school mathematics curriculum dataset. We compare DML estimates to those produced by Ordinary Least Squares (OLS) regression and Kernel matching, focusing on precision and efficiency of causal effect estimation.</p></sec>
<sec>
<title>Results</title>
<p>Across the empirical application, DML yields more precise and efficient estimates of textbook effects than OLS and Kernel matching. The approach reduces reliance on restrictive linearity assumptions and improves the stability of estimated causal impacts in settings where relationships between covariates, curriculum assignment, and outcomes are complex.</p></sec>
<sec>
<title>Discussion</title>
<p>These findings indicate that DML is a robust alternative for evaluating educational materials, offering clearer evidence to inform curriculum selection and adoption decisions. More broadly, the study contributes methodologically to learning and intelligence research by strengthening the tools used to measure educational interventions&#x00027; effects on achievement.</p></sec></abstract>
<kwd-group>
<kwd>California Math</kwd>
<kwd>causal</kwd>
<kwd>DML</kwd>
<kwd>estimation</kwd>
<kwd>non-parametric modeling</kwd>
</kwd-group>
<funding-group>
<funding-statement>The author(s) declared that financial support was not received for this work and/or its publication.</funding-statement>
</funding-group>
<counts>
<fig-count count="0"/>
<table-count count="8"/>
<equation-count count="15"/>
<ref-count count="25"/>
<page-count count="9"/>
<word-count count="6626"/>
</counts>
<custom-meta-group>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>STEM Education</meta-value>
</custom-meta>
</custom-meta-group>
</article-meta>
</front>
<body>
<sec sec-type="introduction" id="s1">
<label>1</label>
<title>Introduction</title>
<p>Student learning outcomes are significantly influenced by the quality of instructional materials. Among the many factors that affect academic success, textbooks remain one of the most widely available resources for educational advancement. Textbook selection provides schools with an effective way to improve student performance with relatively little effort, in contrast to other interventions that call for intensive training or structural modifications (<xref ref-type="bibr" rid="B2">Arumuru and David, 2024</xref>; <xref ref-type="bibr" rid="B20">Li and Wang, 2024</xref>).</p>
<p>This pattern holds in K-12 mathematics education as well: improving achievement through textbook selection is a common and widely accepted practice (<xref ref-type="bibr" rid="B25">Slavin and Lake, 2008</xref>). Scholars have attempted to quantify the impact of textbook selection on academic performance. However, upon reviewing the existing literature on mathematics textbook selection, we found that the evaluation methodology is not well developed. <xref ref-type="table" rid="T1">Table 1</xref> summarizes key papers that have examined mathematics textbook selection and their methodological approaches.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Information on selected papers studying mathematics textbook selection.</p></caption>
<table frame="box" rules="all">
<thead>
<tr>
<th valign="top" align="left"><bold>Title</bold></th>
<th valign="top" align="left"><bold>Research object</bold></th>
<th valign="top" align="left"><bold>Evaluation method used</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Are first graders&#x00027; arithmetic skills related to the quality of mathematics textbooks? A study on students&#x00027; use of arithmetic principles (<xref ref-type="bibr" rid="B24">Sievert et al., 2021</xref>)</td>
<td valign="top" align="left">2,462 students from 40 schools and 127 classes in Schleswig-Holstein</td>
<td valign="top" align="left">OLS (multilevel)</td>
</tr>
<tr>
<td valign="top" align="left">Curriculum reform in the common core era: evaluating elementary math textbooks across six U.S. States (<xref ref-type="bibr" rid="B7">Blazar et al., 2020</xref>)</td>
<td valign="top" align="left">Over 6,000 schools across six states: California, Louisiana, Maryland, New Jersey, New Mexico, and Washington</td>
<td valign="top" align="left">OLS</td>
</tr>
<tr>
<td valign="top" align="left">The formalized processes districts use to evaluate mathematics textbooks (<xref ref-type="bibr" rid="B23">Polikoff et al., 2019</xref>)</td>
<td valign="top" align="left">34 education leaders</td>
<td valign="top" align="left">Interview</td>
</tr>
<tr>
<td valign="top" align="left">Learning by the book: comparing math achievement growth by textbook in six Common Core states (<xref ref-type="bibr" rid="B8">Blazar et al., 2019</xref>)</td>
<td valign="top" align="left">5,107 schools from 6 states</td>
<td valign="top" align="left">OLS</td>
</tr>
<tr>
<td valign="top" align="left">Mathematics curriculum effects on student achievement in California (<xref ref-type="bibr" rid="B17">Koedel et al., 2017</xref>)</td>
<td valign="top" align="left">5,494 schools in California</td>
<td valign="top" align="left">Kernel matching and restricted OLS</td>
</tr>
<tr>
<td valign="top" align="left">Opportunities to learn: mathematics textbooks and students&#x00027; achievements (<xref ref-type="bibr" rid="B14">Hadar, 2017</xref>)</td>
<td valign="top" align="left">4,040 eighth-grade students in an Arab community</td>
<td valign="top" align="left">OLS (hierarchical)</td>
</tr>
<tr>
<td valign="top" align="left">Big bang for just a few bucks: The impact of math textbooks in California (<xref ref-type="bibr" rid="B18">Koedel and Polikoff, 2017</xref>)</td>
<td valign="top" align="left">&#x0007E;7,600 schools in California state that serve grades K-8</td>
<td valign="top" align="left">Kernel matching, restricted OLS, and remnant-based residualized matching</td>
</tr>
<tr>
<td valign="top" align="left">How well aligned are textbooks to the common core standards in mathematics? (<xref ref-type="bibr" rid="B22">Polikoff, 2015</xref>)</td>
<td valign="top" align="left">7 math books</td>
<td valign="top" align="left">Grading (Alignment index)</td>
</tr>
<tr>
<td valign="top" align="left">Is curriculum quality uniform? Evidence from Florida (<xref ref-type="bibr" rid="B5">Bhatt et al., 2013</xref>)</td>
<td valign="top" align="left">1,205 schools in Florida</td>
<td valign="top" align="left">OLS and Kernel matching</td>
</tr>
<tr>
<td valign="top" align="left">Large-scale evaluations of curricular effectiveness (<xref ref-type="bibr" rid="B4">Bhatt and Koedel, 2012</xref>)</td>
<td valign="top" align="left">716 schools in Indiana</td>
<td valign="top" align="left">OLS, Kernel matching, and LLR matching</td>
</tr></tbody>
</table>
</table-wrap>
<p>Traditional causal effect analysis methods, such as fixed effects, difference-in-differences, and two-stage least squares (2SLS), are commonly used in empirical studies, often through linear models. However, linearity is a strong assumption when modeling aims to uncover causal relationships. Linear partial-effect estimates may fail to capture causal effects that are non-constant or take complicated forms. While some may argue that adding polynomials and interaction terms to a linear model can capture non-constant and heterogeneous effects, the number of regressors is limited by the sample size: the number of parameters must be substantially smaller than the number of observations to obtain precise estimates. OLS performs poorly when the sample size is limited and the number of parameters is large or even grows with the sample size (<xref ref-type="bibr" rid="B21">Liu et al., 2023</xref>).</p>
<p>Linear models serve different purposes depending on the situation. OLS regression captures the overall features of the data but may be less flexible in revealing details (<xref ref-type="bibr" rid="B1">Acito, 2023</xref>). Population relationships can be complex and interact non-linearly, making a linear model difficult to justify. In statistical modeling, the objective is to fit the data and make accurate predictions. One could build local non-linear functions and sum them, as in kernel smoothing methods, but this approach can be tedious and inefficient in empirical economic studies (<xref ref-type="bibr" rid="B3">Batlle et al., 2025</xref>; <xref ref-type="bibr" rid="B16">Hiabu et al., 2019</xref>). Because social scientists are more interested in explanations than in a well-fitted model alone, they often turn to non-parametric methods such as kernel matching (<xref ref-type="bibr" rid="B15">Heckman et al., 1998</xref>), or continue to use a linear model even if it has low predictive power or fits the data poorly, as long as it has good explanatory properties (<xref ref-type="bibr" rid="B9">Breznau, 2022</xref>).</p>
<p>Even if a linear model is the final &#x0201C;best&#x0201D; choice among various empirical techniques for an economics researcher, it may still be challenging to construct a robust linear-based model. The researcher must decide what interaction terms, year dummies, and fixed-effect terms to add, which depends on subjective judgment or convention. For example, researchers often include as many regressors (characteristic variables) as possible in their regressions to reduce bias and control variance (<xref ref-type="bibr" rid="B19">Li and M&#x000FC;ller, 2021</xref>). However, increasing the number of independent variables in an OLS regression can lead to multicollinearity and overfitting issues (<xref ref-type="bibr" rid="B13">Efeizomor, 2023</xref>). Additionally, when the sample size is smaller than the number of parameters, the existence of nuisance parameters can result in poor performance of traditional OLS regression, even if the interest is only in a small part of the model&#x00027;s parameters (<xref ref-type="bibr" rid="B19">Li and M&#x000FC;ller, 2021</xref>).</p>
<p>Although Kernel matching already addresses most of these concerns, we suggest Double/Debiased Machine Learning (DML) as a suitable alternative. DML is a recently developed causal estimation technique that leverages the predictive power of machine learning to obtain consistent causal estimators under high-dimensional covariates (<xref ref-type="bibr" rid="B11">Chernozhukov et al., 2018</xref>). When predicting the treatment effect, DML uses a more general algorithm that integrates all regressors in a non-parametric model, allowing both linear and non-linear functional forms and leading to more precise predictions. As a non-parametric method, DML allows the relationship between the treatment and the controls to take any functional form, making it more general and robust than Kernel matching.</p>
<p>In this study, we will first derive the algorithm mathematically to explain how it addresses the limitations of traditional models. Following this, we will apply a slightly modified DML method to a dataset recently used in an elementary school math curriculum study (<xref ref-type="bibr" rid="B17">Koedel et al., 2017</xref>). The aim of this second step is to compare our results with an existing analysis and demonstrate the superiority of DML under these circumstances. Our causal estimates are more efficient than those of Koedel et al.</p>
<p>We selected this dataset for two reasons. First, in the previous study, the authors used traditional estimation methods of kernel matching and restricted OLS as causal analysis tools, which is a perfect foundation for comparison. Second, the authors expressed concerns about the linear setting in their paper, and our work builds a partially linear model with DML, serving as a legitimate extension of their study.</p>
<p>The remainder of the article is organized as follows. The next section introduces the model and provides a mathematical explanation for its advantages in estimating the textbook effects. Section 3 presents the background, data, and results of the study we aim to compare with, using the new model. Section 4 presents our results using the same dataset and compares them with the previous research. Finally, Section 5 concludes this study.</p></sec>
<sec id="s2">
<label>2</label>
<title>The DML model</title>
<p>In this section, we briefly introduce the models and explain their advantages. Since our algorithm is based on the original work by <xref ref-type="bibr" rid="B11">Chernozhukov et al. (2018)</xref>, reading their paper will be especially helpful for understanding this section. We first use a partially linear model, with the binary treatment variable, <italic>D</italic>, entering linearly alongside a non-parametric function, <italic>g</italic><sub>0</sub>(.), of the control variables, <bold>X</bold>. We also use a more general model in which the binary treatment variable <italic>D</italic> enters a fully non-parametric function. The partially linear model is:</p>
<disp-formula id="EQ1"><mml:math id="M1"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable style="text-align:axis;" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:mtext>Y</mml:mtext><mml:mo>=</mml:mo><mml:mtext>D</mml:mtext><mml:msub><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mtext>g</mml:mtext></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>U</mml:mi><mml:mo>,</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtext>U</mml:mtext><mml:mo>&#x02223;</mml:mo><mml:mtext>X</mml:mtext><mml:mo>,</mml:mo><mml:mtext>D</mml:mtext></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>D</mml:mtext><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mtext>m</mml:mtext></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>V</mml:mi><mml:mo>,</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtext>V</mml:mtext><mml:mo>&#x02223;</mml:mo><mml:mtext>X</mml:mtext></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(1)</label></disp-formula>
<p><xref ref-type="disp-formula" rid="EQ1">Equation 1</xref> imposes a constant causal effect on every observation, whereas the more general non-parametric model allows the effect to be heterogeneous across observations. Letting the binary variable <italic>D</italic> enter the <italic>g</italic><sub>0</sub> function:</p>
<disp-formula id="EQ2"><mml:math id="M2"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable style="text-align:axis;" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:mtext>Y</mml:mtext><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mtext>g</mml:mtext></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>,</mml:mo><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>U</mml:mi><mml:mo>,</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtext>U</mml:mtext><mml:mo>&#x02223;</mml:mo><mml:mtext>X</mml:mtext><mml:mo>,</mml:mo><mml:mtext>D</mml:mtext></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>D</mml:mtext><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mtext>m</mml:mtext></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>V</mml:mi><mml:mo>,</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtext>V</mml:mtext><mml:mo>&#x02223;</mml:mo><mml:mtext>X</mml:mtext></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo></mml:mtd></mml:mtr><mml:mtr></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(2)</label></disp-formula>
<p>The parameter of interest is the average treatment effect (ATE), which can be derived from <xref ref-type="disp-formula" rid="EQ2">Equation 2</xref>. When the conditional independence assumption (CIA) is satisfied, the ATE equals the model parameter &#x003B8;<sub>0</sub>, as shown in <xref ref-type="disp-formula" rid="EQ3">Equation 3</xref>:</p>
<disp-formula id="EQ3"><mml:math id="M3"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(3)</label></disp-formula>
<p><italic>X</italic> affects the treatment through <italic>m</italic><sub>0</sub>(<italic>X</italic>) and the outcome variable through <italic>g</italic><sub>0</sub>(<italic>D, X</italic>). Both the <italic>g</italic><sub>0</sub> and <italic>m</italic><sub>0</sub> functions are non-parametric, unknown, and potentially complicated. Additionally, unconfoundedness or the conditional independence assumption (CIA)<xref ref-type="fn" rid="fn0003"><sup>1</sup></xref> must be satisfied if the goal is to identify the causal effect.</p>
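<p>To make <xref ref-type="disp-formula" rid="EQ2">Equation 2</xref> and <xref ref-type="disp-formula" rid="EQ3">Equation 3</xref> concrete, the following Python sketch simulates a heterogeneous-effect model and recovers the ATE by averaging <italic>g</italic><sub>0</sub>(1, <italic>X</italic>) &#x02212; <italic>g</italic><sub>0</sub>(0, <italic>X</italic>). The particular choices of <italic>g</italic><sub>0</sub> and <italic>m</italic><sub>0</sub> here are arbitrary illustrations, not the specification used in our empirical application.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Covariates and illustrative nuisance functions (our own choices, for demonstration only)
X = rng.normal(size=(n, 2))

def g0(d, x):
    # Non-parametric outcome function g0(D, X) with a heterogeneous treatment effect
    return d * (0.5 + 0.3 * x[:, 0] ** 2) + np.sin(x[:, 1])

m0 = 1 / (1 + np.exp(-X[:, 0]))    # propensity E[D | X]
D = rng.binomial(1, m0)            # binary treatment, so E[V | X] = 0 by construction
Y = g0(D, X) + rng.normal(size=n)  # outcome, with E[U | X, D] = 0

# Equation 3: ATE = E[g0(1, X) - g0(0, X)], here 0.5 + 0.3 * E[X0^2] = 0.8
ate = np.mean(g0(1, X) - g0(0, X))
print(round(ate, 2))
```

<p>CIA holds by construction in this simulation, since treatment assignment depends only on the observed covariates; the population average above is then exactly the causal quantity of interest.</p>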
<p>The main idea of DML is to build a Neyman-orthogonal score function and to split the sample for cross-fitting. A simple approach to estimating &#x003B8;<sub>0</sub> in <xref ref-type="disp-formula" rid="EQ1">Equation 1</xref> is to subtract an estimator of <italic>g</italic><sub>0</sub>(<bold>X</bold>) from <italic>Y</italic> and apply the OLS procedure afterward.<xref ref-type="fn" rid="fn0004"><sup>2</sup></xref> A naive estimator of &#x003B8;<sub>0</sub> is then given by:</p>
<disp-formula id="EQ4"><mml:math id="M4"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msubsup><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(4)</label></disp-formula>
<p>In <xref ref-type="disp-formula" rid="EQ4">Equation 4</xref>, <italic>D&#x003B8;</italic><sub>0</sub>&#x0002B;<italic>g</italic><sub>0</sub>(<bold>X</bold>) is a conditional expectation function (CEF). The functional form of <italic>g</italic><sub>0</sub> is unknown and unrestricted and may be non-linear and complicated, whereas the <italic>D&#x003B8;</italic><sub>0</sub> term imposes a linear restriction that may not hold. Since the true CEF is unknown, a more flexible <italic>g</italic><sub>0</sub>(<bold>X</bold>) allows the fitted function to approximate the true CEF as closely as possible.</p>
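<p>As a concrete illustration, the following Python sketch implements the naive estimator of <xref ref-type="disp-formula" rid="EQ4">Equation 4</xref> on simulated data. The data-generating process and the choice to fit the outcome nuisance with a random forest on untreated units (for whom <italic>Y</italic> = <italic>g</italic><sub>0</sub>(<italic>X</italic>) &#x0002B; <italic>U</italic>) are illustrative assumptions, not the procedure used in our empirical application.</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n, p = 4000, 5
theta0 = 0.8  # true constant treatment effect

# Synthetic partially linear model (illustrative choices, not the paper's data)
X = rng.normal(size=(n, p))
g0 = np.sin(X[:, 0]) + X[:, 1] ** 2
D = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = theta0 * D + g0 + rng.normal(size=n)

# Sample splitting: fit g_hat on an auxiliary half, estimate theta on the main half
aux = np.arange(n) < n // 2
main = ~aux
fit_mask = aux & (D == 0)  # untreated units satisfy Y = g0(X) + U
g_hat = RandomForestRegressor(n_estimators=200, random_state=0).fit(X[fit_mask], Y[fit_mask])

# Equation 4: theta_hat = (sum_i D_i^2)^(-1) * sum_i D_i * (Y_i - g_hat(X_i))
Dm, Ym, Xm = D[main], Y[main], X[main]
theta_hat = np.sum(Dm * (Ym - g_hat.predict(Xm))) / np.sum(Dm ** 2)
print(round(theta_hat, 2))
```

<p>Because the machine-learning fit of the nuisance function is regularized, this plug-in estimate inherits regularization bias, which is precisely what the orthogonalized DML score is designed to remove.</p>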
<p>However, this naive estimator of &#x003B8;<sub>0</sub> does not converge properly when machine learning is used to estimate <italic>g</italic><sub>0</sub>(<bold>X</bold>) (<xref ref-type="bibr" rid="B11">Chernozhukov et al., 2018</xref>). The constraints in Lasso and Ridge regression, the penalties in neural networks, and other forms of regularization increase estimation bias in order to control variance. The resulting regularization bias is an unavoidable consequence of the bias&#x02013;variance tradeoff: regularization keeps the variance small but increases the bias. The scaled decomposition of the estimation error of the naive estimator in the partially linear model is given by:</p>
<disp-formula id="EQ5"><mml:math id="M5"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable style="text-align:axis;" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msqrt><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msqrt><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover accent="false"><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>&#x00026;</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:msub><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:msubsup><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mstyle 
displaystyle="true"><mml:msub><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>&#x00026;</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x0002B;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:msub><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:msubsup><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:msub><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(5)</label></disp-formula>
<p>the second term of which is:</p>
<disp-formula id="EQ6"><mml:math id="M6"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msubsup><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>i</mml:mtext></mml:mstyle></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover 
accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mi>P</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(6)</label></disp-formula>
<p>Because of this bias, the sum of the <italic>n</italic> terms of <inline-formula><mml:math id="M7"><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>i</mml:mtext></mml:mstyle></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> in <xref ref-type="disp-formula" rid="EQ5">Equations 5</xref>, <xref ref-type="disp-formula" rid="EQ6">6</xref> does not have a mean of zero. 
While a well-behaved parametric estimator converges at rate <inline-formula><mml:math id="M8"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:math></inline-formula> in the root mean squared error sense, the convergence rate of <inline-formula><mml:math id="M9"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> to <italic>g</italic><sub>0</sub> is slower than 1/2. We denote the convergence rate of <inline-formula><mml:math id="M10"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> by &#x003C6;<sub><italic>g</italic></sub>, with &#x003C6;<sub><italic>g</italic></sub> &#x0003C; 1/2. The performance of <inline-formula><mml:math id="M11"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> is therefore poor: when the sample size is relatively small, the slow convergence rate may leave the estimator far from the true parameter.</p>
<p>A general moment condition is:</p>
<disp-formula id="EQ7"><mml:math id="M12"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003C8;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>,</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle><mml:mo>;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B7;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(7)</label></disp-formula>
<p>where &#x003C8; is a vector of score functions. The score can take many forms, such as a maximum likelihood score function in <xref ref-type="disp-formula" rid="EQ7">Equation 7</xref> or a GMM moment function; &#x003B7;<sub>0</sub> denotes the true value of the nuisance parameters included in <italic>g</italic><sub>0</sub> and <italic>m</italic><sub>0</sub>, with &#x003B7;<sub>0</sub>&#x02208;&#x003C4;, where &#x003C4; is the nuisance parameter space. In addition, the score function must satisfy a further condition: its Gateaux derivative <italic>D</italic><sub><italic>r</italic></sub>[&#x003B7;&#x02212;&#x003B7;<sub>0</sub>] must exist and be insensitive to changes in the nuisance parameter &#x003B7; in any direction. The Gateaux derivative is:</p>
<disp-formula id="EQ8"><mml:math id="M13"><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mi>r</mml:mi></mml:msub><mml:mo>[</mml:mo><mml:mrow><mml:mi>&#x003B7;</mml:mi><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003B7;</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow><mml:mo>]</mml:mo><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mo>&#x02202;</mml:mo><mml:mi>r</mml:mi></mml:msub><mml:mo>{</mml:mo><mml:mrow><mml:mi mathvariant='double-struck'>E</mml:mi><mml:mo stretchy='false'>[</mml:mo><mml:mi>&#x003C8;</mml:mi><mml:mo>(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mo>;</mml:mo><mml:msub><mml:mi>&#x003B8;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003B7;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:mi>r</mml:mi><mml:mo>(</mml:mo><mml:mrow><mml:mi>&#x003B7;</mml:mi><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003B7;</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>]</mml:mo><mml:mo>&#x0007D;</mml:mo><mml:mo>,</mml:mo><mml:mtext>&#x02004;</mml:mtext><mml:mi>&#x003B7;</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math><label>(8)</label></disp-formula>
<p>where <italic>r</italic>&#x02208;[0, 1). Under the Neyman orthogonality condition, <italic>D</italic><sub><italic>r</italic></sub>[&#x003B7;&#x02212;&#x003B7;<sub>0</sub>] exists for all <italic>r</italic>&#x02208;[0, 1) and all &#x003B7;&#x02208;&#x003C4;, and at <italic>r</italic> &#x0003D; 0,</p>
<disp-formula id="EQ9"><mml:math id="M14"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>&#x02202;</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003B7;</mml:mi></mml:mrow></mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mi>&#x003C8;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>,</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle><mml:mo>;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B7;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>&#x003B7;</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B7;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(9)</label></disp-formula>
<p>The two conditions (<xref ref-type="disp-formula" rid="EQ8">Equations 8</xref>, <xref ref-type="disp-formula" rid="EQ9">9</xref>) together constitute the orthogonality of DML estimation.</p>
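<p>A quick numerical check of this orthogonality is possible by simulation. The following Python sketch (all functional forms, perturbation directions, and parameter values are invented for the illustration) evaluates the sample analogue of the map <italic>r</italic> &#x021A6; &#x1D53C;[&#x003C8;(&#x000B7;; &#x003B8;<sub>0</sub>, &#x003B7;<sub>0</sub> &#x0002B; <italic>r</italic>(&#x003B7; &#x02212; &#x003B7;<sub>0</sub>))] for the partially linear score and confirms that its derivative at <italic>r</italic> &#x0003D; 0 is near zero, while a non-orthogonal score fails the same check:</p>

```python
import math
import random

random.seed(0)
n = 100_000
theta0 = 1.5

# Invented nuisance functions of a partially linear model
m0 = lambda x: math.sin(x)   # E[D | X]
g0 = lambda x: x * x         # additive confounding term

# Simulate D = m0(X) + V and Y = D*theta0 + g0(X) + U
X = [random.gauss(0, 1) for _ in range(n)]
V = [random.gauss(0, 1) for _ in range(n)]
U = [random.gauss(0, 1) for _ in range(n)]
D = [m0(x) + v for x, v in zip(X, V)]
Y = [d * theta0 + g0(x) + u for d, x, u in zip(D, X, U)]

# Fixed perturbation directions (eta - eta0) for the two nuisances
dg = lambda x: x
dm = lambda x: 0.5 * x

def mean_score(r):
    """Sample analogue of E[psi] at eta = eta0 + r*(eta - eta0), orthogonal score."""
    return sum((y - d * theta0 - g0(x) - r * dg(x)) * (d - m0(x) - r * dm(x))
               for x, d, y in zip(X, D, Y)) / n

def mean_score_naive(r):
    """Same, for the non-orthogonal score (Y - D*theta0 - g(X)) * D."""
    return sum((y - d * theta0 - g0(x) - r * dg(x)) * d
               for x, d, y in zip(X, D, Y)) / n

h = 0.05  # central finite difference at r = 0
orthogonal_deriv = (mean_score(h) - mean_score(-h)) / (2 * h)
naive_deriv = (mean_score_naive(h) - mean_score_naive(-h)) / (2 * h)
print(f"orthogonal score derivative: {orthogonal_deriv:+.4f}")  # near 0
print(f"naive score derivative:      {naive_deriv:+.4f}")       # clearly nonzero
```

<p>Only the orthogonal score's directional derivative is zero up to Monte Carlo error, which is exactly the content of Equation 9; the naive score's derivative depends on the direction of the nuisance perturbation.</p>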
<p><xref ref-type="disp-formula" rid="EQ10">Equation 10</xref> shows how DML makes the estimator of the true &#x003B8;<sub>0</sub> consistent in a partially linear model. Let <inline-formula><mml:math id="M15"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> denote the consistent DML estimator. The inference on &#x003B8;<sub>0</sub> relies on the score function:</p>
<disp-formula id="EQ10"><mml:math id="M16"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>&#x003C8;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>,</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle><mml:mo>;</mml:mo><mml:mi>&#x003B8;</mml:mi><mml:mo>,</mml:mo><mml:mi>&#x003B7;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>Y</mml:mi><mml:mo>-</mml:mo><mml:mi>D</mml:mi><mml:mi>&#x003B8;</mml:mi><mml:mo>-</mml:mo><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>-</mml:mo><mml:mi>m</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(10)</label></disp-formula>
<p>which satisfies the moment condition <italic>E</italic>(<italic>VU</italic>) &#x0003D; 0 and the orthogonality condition. After some algebra, the DML estimator <inline-formula><mml:math id="M17"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> is given by,</p>
<disp-formula id="EQ11"><mml:math id="M18"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(11)</label></disp-formula>
<p>where <inline-formula><mml:math id="M19"><mml:mover accent="false"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula> is the ML-estimated residual <inline-formula><mml:math id="M20"><mml:mover accent="false"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mi>D</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. Note that in <xref ref-type="disp-formula" rid="EQ11">Equation 11</xref>, <inline-formula><mml:math id="M21"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="M22"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> are obtained from the auxiliary sample, while <inline-formula><mml:math id="M23"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> is obtained from the main sample. The scaled decomposed estimation error is then,</p>
<disp-formula id="EQ12"><mml:math id="M24"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable style="text-align:axis;" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msqrt><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msqrt><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x1D53C;</mml:mi><mml:msup><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:msub><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:msub><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>&#x0002B;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x1D53C;</mml:mi><mml:msup><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:msub><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(12)</label></disp-formula>
<p>Now, this equation can be separated into three parts:</p>
<disp-formula id="EQ13"><mml:math id="M25"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msup><mml:mi>a</mml:mi><mml:mo>&#x02217;</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi mathvariant='double-struck'>E</mml:mi><mml:msup><mml:mi>V</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:msqrt><mml:mi>n</mml:mi></mml:msqrt></mml:mrow></mml:mfrac><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>V</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msub><mml:mi>U</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msup><mml:mi>b</mml:mi><mml:mo>&#x02217;</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi mathvariant='double-struck'>E</mml:mi><mml:msup><mml:mi>V</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:msqrt><mml:mi>n</mml:mi></mml:msqrt></mml:mrow></mml:mfrac><mml:mstyle 
displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>m</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>m</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mn>0</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mstyle></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mover accent='true'><mml:mrow><mml:msub><mml:mi>g</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mrow><mml:mo 
stretchy='true'>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msup><mml:mi>c</mml:mi><mml:mo>&#x02217;</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:msub><mml:mi>o</mml:mi><mml:mi>P</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math><label>(13)</label></disp-formula>
<p>The first term from <xref ref-type="disp-formula" rid="EQ13">Equation 13</xref> converges to a normal distribution under mild conditions, <italic>a</italic><sup>&#x0002A;</sup>&#x021DD;<italic>N</italic>(0, &#x003A3;). The second term <italic>b</italic><sup>&#x0002A;</sup> is determined by the estimation errors of both <italic>m</italic><sub>0</sub>(<bold>X</bold>) and <italic>g</italic><sub>0</sub>(<bold>X</bold>), and it contains regularization bias from both. Its convergence rate depends on the specific machine learning methods used and is usually slower than the square-root rate.</p>
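<p>To see why the product form of <italic>b</italic><sup>&#x0002A;</sup> is helpful, consider an illustrative calculation (the rate 0.3 is invented for this example and is not taken from the cited results). If each nuisance estimator converges at rate <italic>n</italic><sup>&#x02212;0.3</sup>, a term carrying a single estimation error diverges after scaling by &#x0221A;<italic>n</italic>, whereas the product of two such errors still vanishes:</p>

```latex
% Illustrative rates only: \varphi_m = \varphi_g = 0.3, so \varphi_m + \varphi_g = 0.6 > 1/2.
\underbrace{\sqrt{n}\cdot n^{-\varphi_g} = n^{0.5-0.3} = n^{0.2} \longrightarrow \infty}_{\text{single error: diverges}}
\qquad
\underbrace{\sqrt{n}\cdot n^{-(\varphi_m+\varphi_g)} = n^{0.5-0.6} = n^{-0.1} \longrightarrow 0}_{\text{product of errors: vanishes}}
```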
<p>The convergence rate of Random Forest estimators depends on its strong features: the rate is of order <inline-formula><mml:math id="M26"><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mfrac><mml:mrow><mml:mo>-</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>75</mml:mn></mml:mrow><mml:mrow><mml:mi>S</mml:mi><mml:mi>log</mml:mi><mml:mn>2</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>75</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:msup></mml:math></inline-formula>, where <italic>S</italic> is a subset of features (<xref ref-type="bibr" rid="B6">Biau, 2012</xref>). According to the convergence properties of least squares regression under the <italic>L</italic><sub>2</sub> norm (<xref ref-type="bibr" rid="B10">Chen, 2007</xref>), the convergence rate is <inline-formula><mml:math id="M27"><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mfrac><mml:mrow><mml:mo>-</mml:mo><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>p</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>d</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:msup></mml:math></inline-formula>, where <italic>d</italic> is the dimension of the raw explanatory variables and <italic>p</italic> is the assumed degree of smoothness of the conditional expectation function (e.g., the number of derivatives). We can control the bound by choosing proper <italic>p</italic>&#x00027;s for any given <italic>d</italic>. Although <italic>m</italic><sub>0</sub>(<bold>X</bold>) and <italic>g</italic><sub>0</sub>(<bold>X</bold>) are each estimated at a slower convergence rate, the product of the two estimation errors keeps the whole term within a vanishing upper bound. 
This upper bound is <inline-formula><mml:math id="M28"><mml:msqrt><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msqrt><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C6;</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C6;</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:math></inline-formula>, where &#x003C6;<sub><italic>m</italic></sub> is the convergence rate of <inline-formula><mml:math id="M29"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> to <italic>m</italic><sub>0</sub> and &#x003C6;<sub><italic>g</italic></sub> is the convergence rate of <inline-formula><mml:math id="M30"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> to <italic>g</italic><sub>0</sub>. Thus, <italic>b</italic><sup>&#x0002A;</sup> vanishes eventually if &#x003C6;<sub><italic>m</italic></sub>&#x0002B;&#x003C6;<sub><italic>g</italic></sub>&#x0003E;1/2<xref ref-type="fn" rid="fn0005"><sup>3</sup></xref>. 
Another requirement for <inline-formula><mml:math id="M31"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> to be consistent is to control the remainder terms collected in <italic>c</italic><sup>&#x0002A;</sup> and ensure that <inline-formula><mml:math id="M32"><mml:msup><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mo>*</mml:mo></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mi>P</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. In the partially linear model, terms like:</p>
<disp-formula id="E14"><mml:math id="M33"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>are included in <italic>c</italic><sup>&#x0002A;</sup>. Without sample splitting, the model error terms <italic>V</italic><sub><italic>i</italic></sub> and estimation errors <inline-formula><mml:math id="M34"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>X</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> are generally related. The reason is that in estimating, <inline-formula><mml:math id="M35"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> information contained in observation i has already been used, whereas <italic>V</italic><sub><italic>i</italic></sub> also has information from observation <italic>I</italic>; therefore, the relation between them will cause poor performance of <italic>c</italic><sup>&#x0002A;</sup>. 
Conditional on the auxiliary sample and with &#x1D53C;(<italic>V</italic><sub><italic>i</italic></sub>&#x02223;<bold>X</bold><sub><italic>i</italic></sub>) &#x0003D; 0, the term above has a mean of zero and a variance of order <inline-formula><mml:math id="M36"><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:mfrac><mml:mstyle displaystyle='true'><mml:msub><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>I</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>g</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mn>0</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>g</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mstyle></mml:mrow></mml:math></inline-formula>, which converges in probability to zero.</p>
<p>The score has to satisfy both a moment condition and an orthogonality condition to overcome the regularization bias; this stricter requirement on the score function distinguishes DML from traditional methods. Sample splitting is equally important for removing the bias caused by overfitting: the model requires estimates of nuisance parameters such as <italic>g</italic><sub>0</sub>(&#x000B7;) and <italic>m</italic><sub>0</sub>(&#x000B7;) as well as the causal parameter, and these are not estimated simultaneously. Different parts of the data are therefore used to estimate the different components.</p>
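<p>The role of sample splitting can be illustrated with a minimal cross-fitting sketch for the partially linear model. The Python code below is illustrative only (synthetic data; the learners and tuning choices are our assumptions, not the specification used in this study): nuisance functions are fit on the auxiliary folds and evaluated only on the held-out fold, and the Neyman-orthogonal residual-on-residual moment is then solved for the treatment effect.</p>

```python
# Minimal cross-fitting sketch for the partially linear model
#   Y = theta0*D + g0(X) + U,   D = m0(X) + V   (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p = 1000, 5
X = rng.normal(size=(n, p))
g0 = np.sin(X[:, 0]) + X[:, 1] ** 2      # nuisance part of E[Y | X]
m0 = 0.5 * X[:, 0]                       # nuisance E[D | X]
D = m0 + rng.normal(size=n)
theta0 = 0.05                            # true treatment effect
Y = theta0 * D + g0 + rng.normal(size=n)

def dml_plm(Y, D, X, K=5):
    """Cross-fitted, Neyman-orthogonal (residual-on-residual) estimator."""
    res_Y, res_D = np.empty_like(Y), np.empty_like(D)
    for train, test in KFold(K, shuffle=True, random_state=1).split(X):
        # Nuisances are estimated on the auxiliary folds only, then
        # evaluated on the held-out fold: this breaks the dependence
        # between V_i and the estimation error of g0-hat at observation i.
        lY = RandomForestRegressor(n_estimators=100, random_state=1).fit(X[train], Y[train])
        lD = RandomForestRegressor(n_estimators=100, random_state=1).fit(X[train], D[train])
        res_Y[test] = Y[test] - lY.predict(X[test])
        res_D[test] = D[test] - lD.predict(X[test])
    # Solve the orthogonal moment 1/n * sum res_D*(res_Y - theta*res_D) = 0.
    return (res_D @ res_Y) / (res_D @ res_D)

theta_hat = dml_plm(Y, D, X)
```

<p>Averaging the fold-wise moment conditions in this way is exactly the K-fold construction formalized below.</p>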
<p>In the estimation of both models above, the sample of (<italic>D, X</italic>) needs to be i.i.d. (independent and identically distributed), and sample splitting plays an important role. First, divide the sample randomly into <italic>K</italic> folds, such that each subsample <italic>I</italic><sub><italic>k</italic></sub> contains <italic>N</italic>/<italic>K</italic> observations, where <italic>k</italic> &#x02208; {1, &#x02026;, <italic>K</italic>}. The final DML estimator <inline-formula><mml:math id="M37"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>&#x003B8;</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:math></inline-formula> solves:</p>
<disp-formula id="E15"><mml:math id="M38"><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mtext>K</mml:mtext></mml:mfrac><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mtext>k</mml:mtext><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mtext>K</mml:mtext></mml:munderover><mml:mrow><mml:msub><mml:mi mathvariant='double-struck'>E</mml:mi><mml:mrow><mml:mtext>nk</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:mstyle><mml:mo>[</mml:mo><mml:mrow><mml:mi>&#x003C8;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>D</mml:mi><mml:mo>,</mml:mo><mml:mi>X</mml:mi><mml:mo>;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>&#x003B8;</mml:mi><mml:mo>&#x002DC;</mml:mo></mml:mover><mml:mn>0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mover accent='true'><mml:mi>&#x003B7;</mml:mi><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></disp-formula>
<p>where &#x003C8;(&#x000B7;) is the Neyman-orthogonal score function; <inline-formula><mml:math id="M39"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>&#x003B7;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> represents the estimator of the nuisance parameters associated with <italic>g</italic><sub>0</sub>(&#x000B7;) and <italic>m</italic><sub>0</sub>(&#x000B7;); &#x1D53C;<sub><italic>nk</italic></sub> is the empirical expectation over the <italic>k</italic>-th fold of the data. For each subsample <italic>I</italic><sub><italic>k</italic></sub>, its corresponding auxiliary sample is used to construct an ML estimator of <inline-formula><mml:math id="M40"><mml:msub><mml:mrow><mml:mover accent="false"><mml:mrow><mml:mi>&#x003B7;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>.</p></sec>
<sec id="s3">
<label>3</label>
<title>Data</title>
<p>Our analysis compares our results with those of a previous study on the effects of curriculum materials on student achievement based on California data (<xref ref-type="bibr" rid="B17">Koedel et al., 2017</xref>). In this section, we briefly introduce the background, data, and results of that study, on which our work is based.</p>
<p>In California, the curriculum adoption process is partially centralized: the state issues a list of textbooks for a particular subject in a given year. Each district has the option to adopt any textbook from the list or choose not to adopt at all (i.e., not use textbooks on the list). In math, the adoption process typically moves in sync with the state&#x00027;s schedule. The authors of the previous study focused on elementary math textbooks adopted in California in the fall of 2008 and 2009. The data were derived from schools&#x00027; 2013 School Accountability Report Cards (SARC)<xref ref-type="fn" rid="fn0006"><sup>4</sup></xref>.</p>
<p><xref ref-type="table" rid="T2">Table 2</xref> provides an overview of the descriptive statistics for the four math textbooks examined by Koedel et al. The test scores presented in this table are standardized averages at both the school and district levels. Specifically, the &#x0201C;California Math&#x0201D; column identifies schools that adopted California Math, with characteristics from either 2007 or 2008 and at least one Grade 3 test score from 2009 to 2013<xref ref-type="fn" rid="fn0007"><sup>5</sup></xref>. The &#x0201C;Composite Alternative&#x0201D; column, on the other hand, displays the average values for schools that adopted the three other textbooks<xref ref-type="fn" rid="fn0008"><sup>6</sup></xref>. It is important to note that the outcome variable <italic>Y</italic> in this study is the Grade 3 math score, while all other characteristic variables, except for the Grade 3 ELA score, are represented by the vector <bold>X</bold>. Additionally, only the school-level data are used in this study.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Descriptive statistics for California math and composite alternative.</p></caption>
<table frame="box" rules="all">
<thead>
<tr>
<th valign="top" align="left"><bold>Variable</bold></th>
<th valign="top" align="center"><bold>California math</bold></th>
<th valign="top" align="center"><bold>Composite alternative</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="3"><bold>School outcomes</bold></td>
</tr>
<tr>
<td valign="top" align="left">Preadoption Grade 3 math score</td>
<td valign="top" align="center">0.06</td>
<td valign="top" align="center">&#x02212;0.07</td>
</tr>
<tr>
<td valign="top" align="left">Preadoption Grade 3 ELA score</td>
<td valign="top" align="center">0.07</td>
<td valign="top" align="center">&#x02212;0.08</td>
</tr>
<tr>
<td valign="top" align="left" colspan="3"><bold>School characteristics</bold></td>
</tr>
<tr>
<td valign="top" align="left">%Female</td>
<td valign="top" align="center">48.9</td>
<td valign="top" align="center">48.6</td>
</tr>
<tr>
<td valign="top" align="left">%Economically disadvantaged</td>
<td valign="top" align="center">56</td>
<td valign="top" align="center">57.3</td>
</tr>
<tr>
<td valign="top" align="left">%English learner</td>
<td valign="top" align="center">28</td>
<td valign="top" align="center">30.1</td>
</tr>
<tr>
<td valign="top" align="left">%White</td>
<td valign="top" align="center">29.9</td>
<td valign="top" align="center">29.3</td>
</tr>
<tr>
<td valign="top" align="left">%Black</td>
<td valign="top" align="center">6.3</td>
<td valign="top" align="center">7.7</td>
</tr>
<tr>
<td valign="top" align="left">%Asian</td>
<td valign="top" align="center">7.2</td>
<td valign="top" align="center">7.7</td>
</tr>
<tr>
<td valign="top" align="left">%Other ethnicities</td>
<td valign="top" align="center">56.6</td>
<td valign="top" align="center">50.0</td>
</tr>
<tr>
<td valign="top" align="left">Enrollment</td>
<td valign="top" align="center">429.5</td>
<td valign="top" align="center">401.5</td>
</tr>
<tr>
<td valign="top" align="left">2008 adopter</td>
<td valign="top" align="center">53.7</td>
<td valign="top" align="center">48.6</td>
</tr>
<tr>
<td valign="top" align="left" colspan="3"><bold>School-area characteristics (census)</bold></td>
</tr>
<tr>
<td valign="top" align="left">Median household income (log)</td>
<td valign="top" align="center">10.9</td>
<td valign="top" align="center">10.8</td>
</tr>
<tr>
<td valign="top" align="left">Share low education</td>
<td valign="top" align="center">19.3</td>
<td valign="top" align="center">20.0</td>
</tr>
<tr>
<td valign="top" align="left">Share missing census data</td>
<td valign="top" align="center">1.2</td>
<td valign="top" align="center">1.7</td>
</tr>
<tr>
<td valign="top" align="left" colspan="3"><bold>District outcomes</bold></td>
</tr>
<tr>
<td valign="top" align="left">Preadoption Grade 3 math score</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">&#x02212;0.01</td>
</tr>
<tr>
<td valign="top" align="left">Preadoption Grade 3 ELA score</td>
<td valign="top" align="center">&#x02212;0.12</td>
<td valign="top" align="center">&#x02212;0.06</td>
</tr>
<tr>
<td valign="top" align="left" colspan="3"><bold>District characteristics</bold></td>
</tr>
<tr>
<td valign="top" align="left">Enrollment</td>
<td valign="top" align="center">6075.5</td>
<td valign="top" align="center">16022.9</td>
</tr>
<tr>
<td valign="top" align="left">n(Schools)</td>
<td valign="top" align="center">602</td>
<td valign="top" align="center">1276</td>
</tr>
<tr>
<td valign="top" align="left">n(Districts)</td>
<td valign="top" align="center">92</td>
<td valign="top" align="center">224</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>This is summarized from <xref ref-type="bibr" rid="B17">Koedel et al. (2017)</xref> <xref ref-type="table" rid="T1">Table 1</xref>, showing all variables we use. The second column is a weighted average of columns 4, 6, and 7 from <xref ref-type="bibr" rid="B17">Koedel et al. (2017)</xref> <xref ref-type="table" rid="T1">Table 1</xref>.</p>
</table-wrap-foot>
</table-wrap>
<p>In the study, the researchers used traditional techniques to estimate the effects of the four math textbooks on student achievement and pointed out the limitations of their modeling approach. They first calculated the propensity score for each observation using a probit model, and based on the propensity score, they selected a subset of the sample with common support. This meant that only observations from the same range of propensity scores were included in the modeling samples for both the treated and control groups. The researchers then applied kernel matching, restricted OLS, and residualized matching techniques separately to estimate the treatment effect of the four textbooks on student achievement.</p>
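<p>The propensity-score and common-support step described above can be sketched as follows. This is a schematic illustration with synthetic data: we use a logistic model as a stand-in for the probit specification of the original study, and all variable names are hypothetical.</p>

```python
# Illustrative propensity-score / common-support step (logistic stand-in
# for the probit used in the original study; data here are synthetic).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 4))                          # school characteristics
T = (X[:, 0] + rng.normal(size=n) > 0).astype(int)   # 1 = California Math adopter

# Estimated probability of adopting the treatment textbook given X.
ps = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]

# Keep only observations whose propensity score lies in the overlap of
# the treated and control score ranges (the common-support restriction).
lo = max(ps[T == 1].min(), ps[T == 0].min())
hi = min(ps[T == 1].max(), ps[T == 0].max())
support = (ps >= lo) & (ps <= hi)
X_cs, T_cs, ps_cs = X[support], T[support], ps[support]
```

<p>Matching or regression estimators are then run on the trimmed sample only, so treated and control units are compared over the same range of scores.</p>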
<p>Among the four commonly used elementary math textbooks studied, the California Math textbook published by Houghton Mifflin had the highest treatment effect and outperformed the other textbooks. The authors also noted that their restricted OLS model, which imposed a linear form, produced more statistically precise results but introduced bias into their estimation.</p>
<p>Upon discovering that the California Math textbook worked best among the four textbooks studied, the researchers conducted a quasi-experimental study to compare California Math with the composite alternative. They designated adopters of California Math as the treatment group and all other adopters of the three alternative textbooks as the control group.</p>
<p><xref ref-type="table" rid="T3">Table 3</xref> presents the treatment effects observed over 4 years following the adoption of the textbooks. For example, the Year 1 results provide a comparison between the academic performance of students who used the newly adopted textbooks solely in Grade 3. The Year 2 results, on the other hand, compare the performance of students who used the newly adopted textbooks in Grades 2 and 3.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Estimated effects of California math on grade 3 mathematics achievement relative to the composite alternative.</p></caption>
<table frame="box" rules="all">
<thead>
<tr>
<th valign="top" align="left"><bold>Variable</bold></th>
<th valign="top" align="center"><bold>Year 1</bold></th>
<th valign="top" align="center"><bold>Year 2</bold></th>
<th valign="top" align="center"><bold>Year 3</bold></th>
<th valign="top" align="center"><bold>Year 4</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="5"><bold>Treatment: California math; Control: composite alternative</bold></td>
</tr>
<tr>
<td valign="top" align="left">Treatment effect: Kernel matching</td>
<td valign="top" align="center">0.063 (0.054)</td>
<td valign="top" align="center">0.083<break/> (0.051)</td>
<td valign="top" align="center">0.061 (0.059)</td>
<td valign="top" align="center">0.070<break/> (0.059)</td>
</tr>
<tr>
<td valign="top" align="left">Treatment effect: Restricted OLS</td>
<td valign="top" align="center">0.050<sup>&#x0002A;&#x0002A;</sup> (0.019)</td>
<td valign="top" align="center">0.064<sup>&#x0002A;&#x0002A;</sup><break/> (0.023)</td>
<td valign="top" align="center">0.049<sup>&#x0002A;&#x0002A;</sup> (0.023)</td>
<td valign="top" align="center">0.058<sup>&#x0002A;&#x0002A;</sup><break/> (0.023)</td>
</tr>
<tr>
<td valign="top" align="left">Treatment effect: residualized matching</td>
<td valign="top" align="center">0.050<sup>&#x0002A;&#x0002A;</sup> (0.020)</td>
<td valign="top" align="center">0.065<sup>&#x0002A;&#x0002A;</sup><break/> (0.024)</td>
<td valign="top" align="center">0.052<sup>&#x0002A;&#x0002A;</sup> (0.024)</td>
<td valign="top" align="center">0.060<sup>&#x0002A;&#x0002A;</sup><break/> (0.026)</td>
</tr>
<tr>
<td valign="top" align="left">No. of districts/schools (California Math)</td>
<td valign="top" align="center">92/597</td>
<td valign="top" align="center">89/588</td>
<td valign="top" align="center">91/595</td>
<td valign="top" align="center">90/590</td>
</tr>
<tr>
<td valign="top" align="left">No. of districts/schools (composite alternative)</td>
<td valign="top" align="center">213/1143</td>
<td valign="top" align="center">214/1145</td>
<td valign="top" align="center">216/1146</td>
<td valign="top" align="center">213/1144</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p><sup>&#x0002A;&#x0002A;</sup><italic>p</italic> &#x0003C; 0.05.</p>
</table-wrap-foot>
</table-wrap>
<p>To ensure that the estimates have causal interpretations, the authors conducted falsification tests to justify the conditional independence assumption. They estimated two types of models. In the first model, the authors estimated the effects of the pre-adoption curriculum on students by using test scores from previous adoption years (i.e., 3 to 6 years before adoption) instead of scores after adoption. Schools adopting California Math were treated as the treatment group, and the composite alternative served as the control group. The results showed no effect on student scores for all pre-adoption years.</p>
<p>In the second model, the authors used Grade 3 English test scores as the outcome variable and estimated effects for all pre-adoption years, similar to the first model, and all 4 years after adoption, similar to the main model. They also found no effect. Therefore, the authors argue that the conditional independence assumption is satisfied.</p>
<p>The authors noted their surprise upon discovering that the treatment effects did not increase over time. They provided several potential explanations for this unexpected finding, including a moderate dosage effect, insufficient exposure time in earlier grades to have a significant impact on grade three test scores, and variations in the quality of curriculum materials across different grades.</p>
<p>Additionally, the authors acknowledged that the limitations of linear models may have contributed to the unexpected results. It is difficult to believe that linear models can accurately capture the true conditional expectation function (CEF) or its non-parametric/generalized approximation. Furthermore, the California textbook adoption dataset contains a large number of variables that are not all discrete, making it impossible to construct a saturated model.</p>
<p>To address these concerns, we utilize the newly developed DML method to estimate textbook causal effects. DML allows for non-parametric estimation of the CEF, can handle both discrete and continuous variables, and is capable of dealing with high-dimensional nuisance parameters. Unlike traditional techniques, such as separately calculating propensity scores and then using OLS regression or kernel matching, DML integrates all variables into a non-parametric model to obtain consistent treatment effect estimators under the conditional independence assumption. In the next section, we demonstrate how the use of DML leads to more accurate estimations of treatment effects compared with the original results.</p></sec>
<sec sec-type="results" id="s4">
<label>4</label>
<title>Results</title>
<p><xref ref-type="table" rid="T4">Tables 4</xref>&#x02013;<xref ref-type="table" rid="T7">7</xref> present the results of the DML estimations of the student achievement effect of using California Math in each of the 4 years after adoption. Each table shows the results for a different post-adoption year, and the columns display the six machine learning methods used to obtain the <italic>g</italic><sub>0</sub> and <italic>m</italic><sub>0</sub> estimators, plus a &#x0201C;Best&#x0201D; column.</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Effect of California math after 1 year of adoption.</p></caption>
<table frame="box" rules="all">
<thead>
<tr>
<th valign="top" align="left"><bold>Estimates</bold></th>
<th valign="top" align="center"><bold>Lasso</bold></th>
<th valign="top" align="center"><bold>Reg.Trees</bold></th>
<th valign="top" align="center"><bold>Forest</bold></th>
<th valign="top" align="center"><bold>Boosting</bold></th>
<th valign="top" align="center"><bold>Nnet</bold></th>
<th valign="top" align="center"><bold>Ensemble</bold></th>
<th valign="top" align="center"><bold>Best</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="8"><bold>A. Interactive model</bold></td>
</tr>
<tr>
<td valign="top" align="left">ATE</td>
<td valign="top" align="center">0.051</td>
<td valign="top" align="center">0.040</td>
<td valign="top" align="center">0.039</td>
<td valign="top" align="center">0.047</td>
<td valign="top" align="center">0.094</td>
<td valign="top" align="center">0.065</td>
<td valign="top" align="center">0.084</td>
</tr>
<tr>
<td valign="top" align="left">se(median)</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.055</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.022</td>
<td valign="top" align="center">0.040</td>
<td valign="top" align="center">0.025</td>
<td valign="top" align="center">0.034</td>
</tr>
<tr>
<td valign="top" align="left">se</td>
<td valign="top" align="center">0.010</td>
<td valign="top" align="center">0.026</td>
<td valign="top" align="center">0.008</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.017</td>
<td valign="top" align="center">0.008</td>
<td valign="top" align="center">0.020</td>
</tr>
<tr>
<td valign="top" align="left">clustered.se</td>
<td valign="top" align="center">0.031</td>
<td valign="top" align="center">0.034</td>
<td valign="top" align="center">0.016</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.045</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.016</td>
</tr>
<tr>
<td valign="top" align="left" colspan="8"><bold>B. Partially linear model</bold></td>
</tr>
<tr>
<td valign="top" align="left">ATE</td>
<td valign="top" align="center">0.046</td>
<td valign="top" align="center">0.024</td>
<td valign="top" align="center">0.047</td>
<td valign="top" align="center">0.047</td>
<td valign="top" align="center">0.042</td>
<td valign="top" align="center">0.029</td>
<td valign="top" align="center">0.053</td>
</tr>
<tr>
<td valign="top" align="left">se(median)</td>
<td valign="top" align="center">0.011</td>
<td valign="top" align="center">0.026</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.020</td>
</tr>
<tr>
<td valign="top" align="left">se</td>
<td valign="top" align="center">0.010</td>
<td valign="top" align="center">0.017</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.012</td>
<td valign="top" align="center">0.012</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.018</td>
</tr></tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Effect of California math after 2 years of adoption.</p></caption>
<table frame="box" rules="all">
<thead>
<tr>
<th valign="top" align="left"><bold>Estimates</bold></th>
<th valign="top" align="center"><bold>Lasso</bold></th>
<th valign="top" align="center"><bold>Reg.Trees</bold></th>
<th valign="top" align="center"><bold>Forest</bold></th>
<th valign="top" align="center"><bold>Boosting</bold></th>
<th valign="top" align="center"><bold>Nnet</bold></th>
<th valign="top" align="center"><bold>Ensemble</bold></th>
<th valign="top" align="center"><bold>Best</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="8"><bold>A. Interactive model</bold></td>
</tr>
<tr>
<td valign="top" align="left">ATE</td>
<td valign="top" align="center">0.068</td>
<td valign="top" align="center">0.083</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.065</td>
<td valign="top" align="center">0.098</td>
<td valign="top" align="center">0.068</td>
<td valign="top" align="center">0.065</td>
</tr>
<tr>
<td valign="top" align="left">se(median)</td>
<td valign="top" align="center">0.016</td>
<td valign="top" align="center">0.032</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.017</td>
<td valign="top" align="center">0.028</td>
<td valign="top" align="center">0.013</td>
<td valign="top" align="center">0.036</td>
</tr>
<tr>
<td valign="top" align="left">se</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.026</td>
<td valign="top" align="center">0.009</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.021</td>
<td valign="top" align="center">0.009</td>
<td valign="top" align="center">0.023</td>
</tr>
<tr>
<td valign="top" align="left">clustered.se</td>
<td valign="top" align="center">0.026</td>
<td valign="top" align="center">0.027</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.039</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.018</td>
</tr>
<tr>
<td valign="top" align="left" colspan="8"><bold>B. Partially linear model</bold></td>
</tr>
<tr>
<td valign="top" align="left">ATE</td>
<td valign="top" align="center">0.059</td>
<td valign="top" align="center">0.062</td>
<td valign="top" align="center">0.074</td>
<td valign="top" align="center">0.076</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.081</td>
<td valign="top" align="center">0.080</td>
</tr>
<tr>
<td valign="top" align="left">se(median)</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.050</td>
<td valign="top" align="center">0.023</td>
<td valign="top" align="center">0.016</td>
<td valign="top" align="center">0.020</td>
<td valign="top" align="center">0.023</td>
<td valign="top" align="center">0.024</td>
</tr>
<tr>
<td valign="top" align="left">se</td>
<td valign="top" align="center">0.012</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.019</td>
</tr></tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>Effect of California math after 3 years of adoption.</p></caption>
<table frame="box" rules="all">
<thead>
<tr>
<th valign="top" align="left"><bold>Estimates</bold></th>
<th valign="top" align="center"><bold>Lasso</bold></th>
<th valign="top" align="center"><bold>Reg.Trees</bold></th>
<th valign="top" align="center"><bold>Forest</bold></th>
<th valign="top" align="center"><bold>Boosting</bold></th>
<th valign="top" align="center"><bold>Nnet</bold></th>
<th valign="top" align="center"><bold>Ensemble</bold></th>
<th valign="top" align="center"><bold>Best</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="8"><bold>A. Interactive model</bold></td>
</tr>
<tr>
<td valign="top" align="left">ATE</td>
<td valign="top" align="center">0.011</td>
<td valign="top" align="center">0.042</td>
<td valign="top" align="center">0.041</td>
<td valign="top" align="center">0.061</td>
<td valign="top" align="center">0.077</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.049</td>
</tr>
<tr>
<td valign="top" align="left">se(median)</td>
<td valign="top" align="center">0.033</td>
<td valign="top" align="center">0.047</td>
<td valign="top" align="center">0.017</td>
<td valign="top" align="center">0.025</td>
<td valign="top" align="center">0.043</td>
<td valign="top" align="center">0.016</td>
<td valign="top" align="center">0.037</td>
</tr>
<tr>
<td valign="top" align="left">se</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.030</td>
<td valign="top" align="center">0.010</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.024</td>
<td valign="top" align="center">0.010</td>
<td valign="top" align="center">0.028</td>
</tr>
<tr>
<td valign="top" align="left">clustered.se</td>
<td valign="top" align="center">0.047</td>
<td valign="top" align="center">0.032</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.021</td>
<td valign="top" align="center">0.029</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.019</td>
</tr>
<tr>
<td valign="top" align="left" colspan="8"><bold>B. Partially linear model</bold></td>
</tr>
<tr>
<td valign="top" align="left">ATE</td>
<td valign="top" align="center">0.041</td>
<td valign="top" align="center">0.036</td>
<td valign="top" align="center">0.054</td>
<td valign="top" align="center">0.056</td>
<td valign="top" align="center">0.086</td>
<td valign="top" align="center">0.039</td>
<td valign="top" align="center">0.061</td>
</tr>
<tr>
<td valign="top" align="left">se(median)</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.025</td>
<td valign="top" align="center">0.023</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.023</td>
<td valign="top" align="center">0.023</td>
</tr>
<tr>
<td valign="top" align="left">se</td>
<td valign="top" align="center">0.013</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.020</td>
<td valign="top" align="center">0.016</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.021</td>
<td valign="top" align="center">0.021</td>
</tr></tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T7">
<label>Table 7</label>
<caption><p>Effect of California math after 4 years of adoption.</p></caption>
<table frame="box" rules="all">
<thead>
<tr>
<th valign="top" align="left"><bold>Estimates</bold></th>
<th valign="top" align="center"><bold>Lasso</bold></th>
<th valign="top" align="center"><bold>Reg.Trees</bold></th>
<th valign="top" align="center"><bold>Forest</bold></th>
<th valign="top" align="center"><bold>Boosting</bold></th>
<th valign="top" align="center"><bold>Nnet</bold></th>
<th valign="top" align="center"><bold>Ensemble</bold></th>
<th valign="top" align="center"><bold>Best</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="8"><bold>A. Interactive model</bold></td>
</tr>
<tr>
<td valign="top" align="left">ATE</td>
<td valign="top" align="center">0.037</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.039</td>
<td valign="top" align="center">0.055</td>
<td valign="top" align="center">0.087</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.038</td>
</tr>
<tr>
<td valign="top" align="left">se(median)</td>
<td valign="top" align="center">0.024</td>
<td valign="top" align="center">0.049</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.028</td>
<td valign="top" align="center">0.059</td>
<td valign="top" align="center">0.017</td>
<td valign="top" align="center">0.046</td>
</tr>
<tr>
<td valign="top" align="left">se</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.033</td>
<td valign="top" align="center">0.009</td>
<td valign="top" align="center">0.017</td>
<td valign="top" align="center">0.024</td>
<td valign="top" align="center">0.010</td>
<td valign="top" align="center">0.034</td>
</tr>
<tr>
<td valign="top" align="left">clustered.se</td>
<td valign="top" align="center">0.044</td>
<td valign="top" align="center">0.035</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.020</td>
<td valign="top" align="center">0.031</td>
<td valign="top" align="center">0.018</td>
<td valign="top" align="center">0.019</td>
</tr>
<tr>
<td valign="top" align="left" colspan="8"><bold>B. Partially linear model</bold></td>
</tr>
<tr>
<td valign="top" align="left">ATE</td>
<td valign="top" align="center">0.051</td>
<td valign="top" align="center">0.028</td>
<td valign="top" align="center">0.047</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.050</td>
<td valign="top" align="center">0.058</td>
<td valign="top" align="center">0.058</td>
</tr>
<tr>
<td valign="top" align="left">se(median)</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.027</td>
<td valign="top" align="center">0.022</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.023</td>
<td valign="top" align="center">0.021</td>
</tr>
<tr>
<td valign="top" align="left">se</td>
<td valign="top" align="center">0.013</td>
<td valign="top" align="center">0.019</td>
<td valign="top" align="center">0.020</td>
<td valign="top" align="center">0.015</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.020</td>
<td valign="top" align="center">0.020</td>
</tr></tbody>
</table>
</table-wrap>
<p>For the &#x0201C;Lasso&#x0201D; method, all characteristic variables listed in <xref ref-type="table" rid="T1">Table 1</xref> are included along with sixth-order polynomials of school-level enrollment, sixth-order polynomials of district-level enrollment, and eighth-order polynomials of income, together with all their second-order interaction terms. For all other methods, the variables enter in levels, without interaction or polynomial terms.</p>
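A minimal sketch of one reading of the &#x0201C;Lasso&#x0201D; feature construction described above: polynomial terms for the two enrollment variables and income, stacked with the remaining covariates, followed by all second-order interactions. All data and variable names here are simulated and illustrative, not the study&#x00027;s actual covariates.

```python
# Illustrative feature construction for the "Lasso" specification (hypothetical
# data): 6th-order polynomials of two enrollment variables, 8th-order
# polynomials of income, plus all second-order interaction terms.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
n = 200
base = rng.normal(size=(n, 4))                    # other characteristic variables
school_enroll = rng.uniform(100, 2000, size=n)
district_enroll = rng.uniform(1000, 50000, size=n)
income = rng.uniform(20, 120, size=n)

# Polynomial terms: 6th order for enrollment variables, 8th order for income
polys = np.column_stack(
    [school_enroll ** k for k in range(1, 7)]
    + [district_enroll ** k for k in range(1, 7)]
    + [income ** k for k in range(1, 9)]
)

X = np.column_stack([base, polys])                # 4 + 6 + 6 + 8 = 24 columns

# All second-order interaction terms of the assembled features:
# 24 original columns plus 24*23/2 = 276 pairwise products -> 300 columns
X_full = PolynomialFeatures(
    degree=2, interaction_only=True, include_bias=False
).fit_transform(X)
```

The resulting high-dimensional design matrix is then suitable for a Lasso fit, which selects among the expanded terms.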
<p>In the &#x0201C;Reg.Trees&#x0201D; method, a single decision tree is fitted, with the hyperparameter chosen by 2-fold cross-validation. The &#x0201C;Forest&#x0201D; method runs random forests and averages over 1,000 trees. The &#x0201C;Boosting&#x0201D; method uses boosted regression trees with 2-fold cross-validation. The &#x0201C;Neural net&#x0201D; method uses two neurons, with a logistic loss function for classification and a linear one for regression. Finally, the &#x0201C;Ensemble&#x0201D; method averages the predictions of &#x0201C;Lasso,&#x0201D; &#x0201C;Boosting,&#x0201D; &#x0201C;Random Forests,&#x0201D; and &#x0201C;Neural Net.&#x0201D;</p>
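The &#x0201C;Ensemble&#x0201D; nuisance estimator can be sketched as follows, averaging out-of-fold predictions from the four learners. Learner settings below are illustrative stand-ins (on simulated data), not the exact configurations used in the study.

```python
# Hypothetical sketch of the "Ensemble" nuisance estimator: average the
# out-of-fold predictions of Lasso, random forests (1,000 trees), boosting,
# and a two-neuron neural network on simulated regression data.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LassoCV
from sklearn.model_selection import cross_val_predict
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=1.0, random_state=0)

learners = {
    "lasso": LassoCV(cv=2),
    "forest": RandomForestRegressor(n_estimators=1000, random_state=0),
    "boosting": GradientBoostingRegressor(random_state=0),
    "neural_net": MLPRegressor(hidden_layer_sizes=(2,), max_iter=5000,
                               random_state=0),
}

# Out-of-fold predictions for each learner (2-fold, mirroring the CV above)
preds = {name: cross_val_predict(m, X, y, cv=2) for name, m in learners.items()}

# "Ensemble": simple average of the four learners' predictions
ensemble_pred = np.mean(np.column_stack(list(preds.values())), axis=1)
```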
<p>The last column, &#x0201C;best,&#x0201D; works differently: at each sample split it selects the method(s) that give the best estimates of <italic>g</italic><sub>0</sub> and <italic>m</italic><sub>0</sub>, and then uses each selected method to estimate them separately. DML may therefore use different methods for the <italic>g</italic><sub>0</sub> and <italic>m</italic><sub>0</sub> estimations in the &#x0201C;best&#x0201D; column.</p>
<p>These tables show how different machine learning methods perform in estimating the treatment effect of using California Math on student achievement, and they can help inform future research and policy decisions.</p>
<p>Each table presents two sets of results: Panel A displays the general interactive DML model and Panel B the partially linear DML model. For each model, an estimate of the average treatment effect (ATE) and its standard error are reported. The &#x0201C;se(median)&#x0201D; row reports standard errors using the median method to adjust for variation across splits<xref ref-type="fn" rid="fn0009"><sup>7</sup></xref>, while the &#x0201C;se&#x0201D; row reports the median standard error across the 10 splits.</p>
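The median aggregation across sample splits can be sketched as below, following definition (3.3) of Chernozhukov et al. (2018): the point estimate is the median across splits, and each split&#x00027;s variance is inflated by that split&#x00027;s squared deviation from the median estimate before the median is taken again. The numbers in the example are illustrative, not values from the tables.

```python
# Sketch of the "se(median)" adjustment for variation across sample splits:
# theta = median of per-split estimates; the adjusted variance is the median
# of (se_k^2 + (theta_k - theta)^2) over splits k.
import numpy as np

def median_aggregate(estimates, std_errors):
    """Aggregate per-split DML estimates and standard errors (median method)."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    theta = np.median(estimates)
    se = np.sqrt(np.median(std_errors ** 2 + (estimates - theta) ** 2))
    return theta, se

# Example with 10 splits; the adjusted se can only exceed the plain se
theta, se = median_aggregate(
    estimates=[0.05, 0.06, 0.055, 0.048, 0.052, 0.058, 0.051, 0.049, 0.053, 0.057],
    std_errors=[0.014] * 10,
)
```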
<p>DML standard errors, both &#x0201C;se(median)&#x0201D; and &#x0201C;se,&#x0201D; are calculated under the assumption of independent and identically distributed (i.i.d.) sampling. For the dataset used in this study, however, clustered standard errors are more appropriate. To address this, we employ a bootstrap procedure in which the entire sample is resampled at the district level to obtain 50 bootstrapped samples. A DML estimate is computed on each bootstrapped sample, and the standard deviation of the 50 estimates gives the clustered standard errors reported in the &#x0201C;clustered.se&#x0201D; row of each result table.</p>
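The district-level cluster bootstrap can be sketched as follows. Here <monospace>estimate_ate</monospace> is a purely illustrative placeholder for a full DML fit, and the simulated district structure is hypothetical.

```python
# Minimal sketch of the district-level cluster bootstrap behind "clustered.se":
# resample whole districts with replacement, re-estimate on each bootstrap
# sample, and take the standard deviation of the 50 estimates.
import numpy as np

rng = np.random.default_rng(0)
districts = np.repeat(np.arange(20), 10)   # 20 districts, 10 schools each
y = rng.normal(size=districts.size)        # stand-in outcome data

def estimate_ate(sample_idx):
    # Placeholder for running the full DML pipeline on the resampled data
    return y[sample_idx].mean()

boot_estimates = []
for _ in range(50):                        # 50 bootstrapped samples, as above
    drawn = rng.choice(np.unique(districts), size=20, replace=True)
    # Keep every school in each drawn district (districts, not schools, resampled)
    idx = np.concatenate([np.flatnonzero(districts == d) for d in drawn])
    boot_estimates.append(estimate_ate(idx))

clustered_se = np.std(boot_estimates, ddof=1)
```

Resampling at the district level preserves the within-district correlation structure that an i.i.d. bootstrap would break.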
<p>In <xref ref-type="table" rid="T8">Table 8</xref>, we present the ensemble DML results for comparison. The main motivation for using DML is to relax the linearity restriction and obtain a more general and representative model than partially linear alternatives.</p>
<table-wrap position="float" id="T8">
<label>Table 8</label>
<caption><p>Effects of California math on grade 3 mathematics achievement relative to the composite alternative: compare with DML results.</p></caption>
<table frame="box" rules="all">
<thead>
<tr>
<th valign="top" align="left"><bold>Variable</bold></th>
<th valign="top" align="center"><bold>Year 1</bold></th>
<th valign="top" align="center"><bold>Year 2</bold></th>
<th valign="top" align="center"><bold>Year 3</bold></th>
<th valign="top" align="center"><bold>Year 4</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="5"><bold>Treatment: California math; Control: Composite alternative</bold></td>
</tr>
<tr>
<td valign="top" align="left">Treatment effect: Kernel matching</td>
<td valign="top" align="center">0.063 (0.054)</td>
<td valign="top" align="center">0.083<break/> (0.051)</td>
<td valign="top" align="center">0.061 (0.059)</td>
<td valign="top" align="center">0.070<break/> (0.059)</td>
</tr>
<tr>
<td valign="top" align="left">Treatment effect: Restricted OLS</td>
<td valign="top" align="center">0.050<sup>&#x0002A;&#x0002A;</sup> (0.019)</td>
<td valign="top" align="center">0.064<sup>&#x0002A;&#x0002A;</sup> (0.023)</td>
<td valign="top" align="center">0.049<sup>&#x0002A;&#x0002A;</sup> (0.023)</td>
<td valign="top" align="center">0.058<sup>&#x0002A;&#x0002A;</sup> (0.023)</td>
</tr>
<tr>
<td valign="top" align="left">Treatment effect: interactive DML (Ensemble)</td>
<td valign="top" align="center">0.065<sup>&#x0002A;&#x0002A;</sup> (0.015)</td>
<td valign="top" align="center">0.068<sup>&#x0002A;&#x0002A;</sup> (0.019)</td>
<td valign="top" align="center">0.057<sup>&#x0002A;&#x0002A;</sup> (0.018)</td>
<td valign="top" align="center">0.057<sup>&#x0002A;&#x0002A;</sup> (0.018)</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p><sup>&#x0002A;&#x0002A;</sup><italic>p</italic> &#x0003C; 0.05.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec sec-type="discussion" id="s5">
<label>5</label>
<title>Discussion</title>
<p>The results from the interactive DML model show smaller clustered standard errors compared to Kernel Matching and OLS. DML fulfills its promise of being a more efficient non-parametric estimator due to its two core strategies: orthogonalization and sample splitting.</p>
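The two core strategies named above can be illustrated with a minimal cross-fitted partially linear DML sketch on simulated data: out-of-fold predictions residualize both the outcome and the treatment on the covariates (orthogonalization via sample splitting), and the treatment effect is recovered from the residual-on-residual regression. This is a sketch of the estimator&#x00027;s logic, not the paper&#x00027;s exact specification.

```python
# Illustrative cross-fitted partially linear DML on simulated data with a
# known treatment effect of 0.5: residualize Y and D on X with out-of-fold
# random-forest predictions, then regress Y-residuals on D-residuals.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 5))
D = X[:, 0] + rng.normal(size=n)                  # treatment depends on X
Y = 0.5 * D + X[:, 0] ** 2 + rng.normal(size=n)   # true effect = 0.5

# Out-of-fold predictions implement sample splitting (cross-fitting)
m_hat = cross_val_predict(RandomForestRegressor(random_state=0), X, D, cv=2)
ell_hat = cross_val_predict(RandomForestRegressor(random_state=0), X, Y, cv=2)

v, u = D - m_hat, Y - ell_hat                     # orthogonalized residuals
theta = np.sum(v * u) / np.sum(v * v)             # Neyman-orthogonal estimate
```

Because the nuisance predictions are out-of-fold, overfitting in the random forests does not contaminate the residuals, which is what delivers the efficiency discussed above.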
<p>For the point estimates, the interactive DML results are quite similar to Kernel Matching in Year 1 and Year 3; however, the effects are fairly stable across years under interactive DML, in contrast to the rise-and-fall pattern observed under Kernel Matching. The DML results also align with what we would expect in practice, as a stable effect is more plausible than a fluctuating one. From both a substantive and a model-specification perspective, therefore, the DML results offer new insight into the true effect.</p>
<p>Moreover, the difference between OLS and interactive DML in Year 1 highlights the performance gap between the two models. Although linear models can capture the broad features of the data, their lack of flexibility and stability makes it worthwhile to also fit non-parametric models such as DML.</p>
<p>It is important to acknowledge several limitations of this study. First, the causal interpretation of the results depends on the Conditional Independence Assumption holding. While <xref ref-type="bibr" rid="B17">Koedel et al. (2017)</xref> provided falsification tests to support this assumption, the current study does not offer additional validation procedures.</p>
<p>Second, the data structure creates challenges due to hierarchical dependence. Schools are nested within districts, which introduces complex correlation patterns. Although clustered bootstrap methods help address some of these dependencies, they may not fully account for all sources of correlation present in the data.</p>
<p>Finally, the findings come from the specific context of California&#x00027;s textbook adoption process during the period studied. The unique characteristics of this setting may affect results in ways that do not extend to other states, different grade levels, or subjects other than mathematics.</p></sec>
<sec sec-type="conclusion" id="s6">
<label>6</label>
<title>Conclusion</title>
<p>The evaluation of treatment effects is crucial in educational research, covering policies, textbooks, and teacher training (<xref ref-type="bibr" rid="B12">Chingos and Whitehurst, 2012</xref>). The most commonly used tool for this purpose is OLS, owing to its simplicity. In this study, however, we introduce a recently developed method, double/debiased machine learning (DML), which outperforms both OLS and kernel matching, the second most popular tool, in providing statistically significant estimates of textbook performance.</p>
<p>We first provide a mathematical explanation of why DML is superior, and then compare our results to a previous study that used the same dataset. Our findings suggest that DML not only surpasses OLS in overcoming linear restrictions but also outperforms kernel matching in providing a more precise estimate. Despite its advantages, DML has some limitations. It requires larger samples than the other two methods, and small samples can cause it to fail. It also demands significantly more computing power; processing the data can take several days on a personal computer.</p>
<p>Our study makes several contributions: first, we applied the DML method to evaluate textbooks and demonstrated its superiority over OLS and Kernel matching, serving as a template for future studies. Second, we addressed concerns raised by <xref ref-type="bibr" rid="B17">Koedel et al. (2017)</xref> and verified the limitations of linear models for the data, cautioning against potential issues in similar studies. Finally, we extended the built-in DML standard errors to account for clustering, which is a minor but useful contribution to the field of model application.</p></sec>
</body>
<back>
<sec sec-type="data-availability" id="s7">
<title>Data availability statement</title>
<p>The data that support the findings of this study are available from the corresponding author upon reasonable request.</p>
</sec>
<sec sec-type="author-contributions" id="s8">
<title>Author contributions</title>
<p>ZF: Writing &#x02013; review &#x00026; editing, Writing &#x02013; original draft. YB: Writing &#x02013; original draft, Writing &#x02013; review &#x00026; editing.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="ai-statement" id="s10">
<title>Generative AI statement</title>
<p>The author(s) declared that generative AI was not used in the creation of this manuscript.</p>
<p>Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.</p></sec>
<sec sec-type="disclaimer" id="s11">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<mixed-citation publication-type="book"><person-group person-group-type="author"><name><surname>Acito</surname> <given-names>F.</given-names></name></person-group> (<year>2023</year>). <article-title>&#x0201C;Ordinary least squares regression,&#x0201D;</article-title> in <source>Predictive Analytics with KNIME: Analytics for Citizen Data Scientists</source> (<publisher-loc>New York</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>105</fpage>&#x02013;<lpage>124</lpage>. doi: <pub-id pub-id-type="doi">10.1007/978-3-031-45630-5_6</pub-id></mixed-citation>
</ref>
<ref id="B2">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Arumuru</surname> <given-names>L.</given-names></name> <name><surname>David</surname> <given-names>T. O.</given-names></name></person-group> (<year>2024</year>). <article-title>The impact of instructional resources on academic achievement: a study of library and information science postgraduates in Nigeria</article-title>. <source>Asian J. Inf. Sci. Technol.</source> <volume>14</volume>, <fpage>54</fpage>&#x02013;<lpage>60</lpage>. doi: <pub-id pub-id-type="doi">10.70112/ajist-2024.14.1.4259</pub-id></mixed-citation>
</ref>
<ref id="B3">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Batlle</surname> <given-names>P.</given-names></name> <name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Hosseini</surname> <given-names>B.</given-names></name> <name><surname>Owhadi</surname> <given-names>H.</given-names></name> <name><surname>Stuart</surname> <given-names>A. M.</given-names></name></person-group> (<year>2025</year>). <article-title>Error analysis of kernel/GP methods for nonlinear and parametric PDEs</article-title>. <source>J. Comput. Phys.</source> <volume>520</volume>:<fpage>113488</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jcp.2024.113488</pub-id></mixed-citation>
</ref>
<ref id="B4">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bhatt</surname> <given-names>R. R.</given-names></name> <name><surname>Koedel</surname> <given-names>C.</given-names></name></person-group> (<year>2012</year>). <article-title>Large-scale evaluations of curricular effectiveness</article-title>. <source>Educ. Eval. Policy Anal.</source> <volume>34</volume>, <fpage>391</fpage>&#x02013;<lpage>412</lpage>. doi: <pub-id pub-id-type="doi">10.3102/0162373712440040</pub-id></mixed-citation>
</ref>
<ref id="B5">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bhatt</surname> <given-names>R. R.</given-names></name> <name><surname>Koedel</surname> <given-names>C.</given-names></name> <name><surname>Lehmann</surname> <given-names>D.</given-names></name></person-group> (<year>2013</year>). <article-title>Is curriculum quality uniform? Evidence from Florida</article-title>. <source>Econ. Educ. Rev.</source> <volume>34</volume>, <fpage>107</fpage>&#x02013;<lpage>121</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.econedurev.2013.01.014</pub-id></mixed-citation>
</ref>
<ref id="B6">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Biau</surname> <given-names>G.</given-names></name></person-group> (<year>2012</year>). <article-title>Analysis of a random forests model</article-title>. <source>J. Mach. Learn. Res.</source> <volume>13</volume>, <fpage>1063</fpage>&#x02013;<lpage>1095</lpage>.</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Blazar</surname> <given-names>D.</given-names></name> <name><surname>Heller</surname> <given-names>B.</given-names></name> <name><surname>Kane</surname> <given-names>T. J.</given-names></name> <name><surname>Polikoff</surname> <given-names>M.</given-names></name> <name><surname>Staiger</surname> <given-names>D. O.</given-names></name> <name><surname>Carrell</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Curriculum reform in the Common Core era: evaluating elementary math textbooks across six U.S. states</article-title>. <source>J. Policy Anal. Manag</source>. <volume>39</volume>, <fpage>966</fpage>&#x02013;<lpage>1019</lpage>. doi: <pub-id pub-id-type="doi">10.1002/pam.22257</pub-id></mixed-citation>
</ref>
<ref id="B8">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Blazar</surname> <given-names>D.</given-names></name> <name><surname>Heller</surname> <given-names>B.</given-names></name> <name><surname>Kane</surname> <given-names>T.</given-names></name> <name><surname>Polikoff</surname> <given-names>M.</given-names></name> <name><surname>Staiger</surname> <given-names>D.</given-names></name> <name><surname>Carrell</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2019</year>). <source>Learning by the Book: Comparing Math Achievement Growth by Textbook in Six Common Core States</source>. Center for Education Policy Research, Harvard University.</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Breznau</surname> <given-names>N.</given-names></name></person-group> (<year>2022</year>). <article-title>Integrating computer prediction methods in social science: a comment on Hofman et al. (2021)</article-title>. <source>Soc. Sci. Comput. Rev.</source> <volume>40</volume>, <fpage>844</fpage>&#x02013;<lpage>853</lpage>. doi: <pub-id pub-id-type="doi">10.1177/08944393211049776</pub-id></mixed-citation>
</ref>
<ref id="B10">
<mixed-citation publication-type="book"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>X.</given-names></name></person-group> (<year>2007</year>). <article-title>&#x0201C;Large sample sieve estimation of semi-nonparametric models,&#x0201D;</article-title> in <source>Handbook of Econometrics</source> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>Elsevier</publisher-name>), <fpage>5549</fpage>&#x02013;<lpage>5632</lpage>.</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chernozhukov</surname> <given-names>V.</given-names></name> <name><surname>Chetverikov</surname> <given-names>D.</given-names></name> <name><surname>Demirer</surname> <given-names>M.</given-names></name> <name><surname>Duflo</surname> <given-names>E.</given-names></name> <name><surname>Hansen</surname> <given-names>C.</given-names></name> <name><surname>Newey</surname> <given-names>W.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Double/debiased machine learning for treatment and structural parameters</article-title>. <source>Econ. J.</source> <volume>21</volume>, <fpage>C1</fpage>&#x02013;<lpage>C68</lpage>. doi: <pub-id pub-id-type="doi">10.1111/ectj.12097</pub-id></mixed-citation>
</ref>
<ref id="B12">
<mixed-citation publication-type="book"><person-group person-group-type="author"><name><surname>Chingos</surname> <given-names>M. M.</given-names></name> <name><surname>Whitehurst</surname> <given-names>G. J.</given-names></name></person-group> (<year>2012</year>). <source>Choosing Blindly: Instructional Materials, Teacher Effectiveness, and the Common Core</source>. <publisher-loc>Washington, DC</publisher-loc>: <publisher-name>Brookings Institution</publisher-name>.</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Efeizomor</surname> <given-names>R. O.</given-names></name></person-group> (<year>2023</year>). <article-title>A comparative study of methods of remedying multicolinearity</article-title>. <source>Am. J. Theor. Appl. Stat.</source> <volume>12</volume>, <fpage>87</fpage>&#x02013;<lpage>91</lpage>. doi: <pub-id pub-id-type="doi">10.11648/j.ajtas.20231204.14</pub-id></mixed-citation>
</ref>
<ref id="B14">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Hadar</surname> <given-names>L. L.</given-names></name></person-group> (<year>2017</year>). <article-title>Opportunities to learn: mathematics textbooks and students&#x00027; achievements</article-title>. <source>Stud. Educ. Eval.</source> <volume>55</volume>, <fpage>153</fpage>&#x02013;<lpage>166</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.stueduc.2017.10.002</pub-id></mixed-citation>
</ref>
<ref id="B15">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Heckman</surname> <given-names>J. J.</given-names></name> <name><surname>Ichimura</surname> <given-names>H.</given-names></name> <name><surname>Todd</surname> <given-names>P.</given-names></name></person-group> (<year>1998</year>). <article-title>Matching as an econometric evaluation estimator</article-title>. <source>Rev. Econ. Stud.</source> <volume>65</volume>, <fpage>261</fpage>&#x02013;<lpage>294</lpage>. doi: <pub-id pub-id-type="doi">10.1111/1467-937X.00044</pub-id></mixed-citation>
</ref>
<ref id="B16">
<mixed-citation publication-type="book"><person-group person-group-type="author"><name><surname>Hiabu</surname> <given-names>M.</given-names></name> <name><surname>Mammen</surname> <given-names>E.</given-names></name> <name><surname>Meyer</surname> <given-names>J. T.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Local linear smoothing in additive models as data projection,&#x0201D;</article-title> in <source>Foundations of Modern Statistics</source>, eds. D. Belomestny, C. Butucea, E. Mammen, E. Moulines, M. Rei&#x000DF;, and V. V. Ulyanov (<publisher-loc>New York</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>197</fpage>&#x02013;<lpage>223</lpage>.</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Koedel</surname> <given-names>C.</given-names></name> <name><surname>Li</surname> <given-names>D.</given-names></name> <name><surname>Polikoff</surname> <given-names>M. S.</given-names></name> <name><surname>Hardaway</surname> <given-names>T.</given-names></name> <name><surname>Wrabel</surname> <given-names>S. L.</given-names></name></person-group> (<year>2017</year>). <article-title>Mathematics curriculum effects on student achievement in California</article-title>. <source>AERA Open</source> <volume>3</volume>, <fpage>1</fpage>&#x02013;<lpage>22</lpage>. doi: <pub-id pub-id-type="doi">10.1177/2332858417690511</pub-id></mixed-citation>
</ref>
<ref id="B18">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Koedel</surname> <given-names>C.</given-names></name> <name><surname>Polikoff</surname> <given-names>M.</given-names></name></person-group> (<year>2017</year>). <article-title>Big bang for just a few bucks: the impact of math textbooks in California</article-title>. <source>Evid. Speaks Rep.</source> <volume>2</volume>, <fpage>1</fpage>&#x02013;<lpage>7</lpage>.</mixed-citation>
</ref>
<ref id="B19">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>C.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>U. K.</given-names></name></person-group> (<year>2021</year>). <article-title>Linear regression with many controls of limited explanatory power</article-title>. <source>Quant. Econ.</source> <volume>12</volume>, <fpage>405</fpage>&#x02013;<lpage>442</lpage>. doi: <pub-id pub-id-type="doi">10.3982/QE1577</pub-id></mixed-citation>
</ref>
<ref id="B20">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>F.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name></person-group> (<year>2024</year>). <article-title>A study on textbook use and its effects on students&#x00027; academic performance</article-title>. <source>Discip. Interdiscip. Sci. Educ. Res.</source> <volume>6</volume>:<fpage>4</fpage>. doi: <pub-id pub-id-type="doi">10.1007/978-3-031-52924-5</pub-id></mixed-citation>
</ref>
<ref id="B21">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>C.</given-names></name> <name><surname>Zhao</surname> <given-names>X.</given-names></name> <name><surname>Huang</surname> <given-names>J.</given-names></name></person-group> (<year>2023</year>). <article-title>New tests for high-dimensional linear regression based on random projection</article-title>. <source>Stat. Sin.</source> <volume>33</volume>, <fpage>475</fpage>&#x02013;<lpage>498</lpage>. doi: <pub-id pub-id-type="doi">10.5705/ss.202020.0405</pub-id></mixed-citation>
</ref>
<ref id="B22">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Polikoff</surname> <given-names>M. S.</given-names></name></person-group> (<year>2015</year>). <article-title>How well aligned are textbooks to the Common Core standards in mathematics?</article-title> <source>Am. Educ. Res. J.</source> <volume>52</volume>, <fpage>1185</fpage>&#x02013;<lpage>1211</lpage>. doi: <pub-id pub-id-type="doi">10.3102/0002831215584435</pub-id></mixed-citation>
</ref>
<ref id="B23">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Polikoff</surname> <given-names>M. S.</given-names></name> <name><surname>Campbell</surname> <given-names>S. E.</given-names></name> <name><surname>Rabovsky</surname> <given-names>S.</given-names></name> <name><surname>Koedel</surname> <given-names>C.</given-names></name> <name><surname>Le</surname> <given-names>Q. T.</given-names></name> <name><surname>Hardaway</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>The formalized processes districts use to evaluate mathematics textbooks</article-title>. <source>J. Curric. Stud.</source> <volume>52</volume>, <fpage>451</fpage>&#x02013;<lpage>477</lpage>. doi: <pub-id pub-id-type="doi">10.1080/00220272.2020.1747116</pub-id></mixed-citation>
</ref>
<ref id="B24">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sievert</surname> <given-names>H.</given-names></name> <name><surname>van den Ham</surname> <given-names>A.-K.</given-names></name> <name><surname>Heinze</surname> <given-names>A.</given-names></name></person-group> (<year>2021</year>). <article-title>Are first graders&#x00027; arithmetic skills related to the quality of mathematics textbooks? a study on students&#x00027; use of arithmetic principles</article-title>. <source>Learn. Ins.</source> <volume>71</volume>:<fpage>101401</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.learninstruc.2020.101401</pub-id></mixed-citation>
</ref>
<ref id="B25">
<mixed-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Slavin</surname> <given-names>R. E.</given-names></name> <name><surname>Lake</surname> <given-names>C.</given-names></name></person-group> (<year>2008</year>). <article-title>Effective programs in elementary mathematics: a best-evidence synthesis</article-title>. <source>Rev. Educ. Res.</source> <volume>78</volume>, <fpage>427</fpage>&#x02013;<lpage>515</lpage>. doi: <pub-id pub-id-type="doi">10.3102/0034654308317473</pub-id></mixed-citation>
</ref>
</ref-list>
<fn-group>
<fn fn-type="custom" custom-type="edited-by" id="fn0001">
<p>Edited by: <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/2559406/overview">Gladys Sunzuma</ext-link>, Bindura University of Science Education, Zimbabwe</p>
</fn>
<fn fn-type="custom" custom-type="reviewed-by" id="fn0002">
<p>Reviewed by: <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/3199343/overview">Jamiu Idowu</ext-link>, University College London, United Kingdom</p>
<p><ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/3253395/overview">Munir Ahmad</ext-link>, Government of Pakistan, Pakistan</p>
</fn>
</fn-group>
<fn-group>
<fn id="fn0003"><label>1</label><p>Unconfoundedness means that, conditional on <italic>X</italic>, the counterfactuals <italic>Y</italic>(0) and <italic>Y</italic>(1) are uncorrelated with the treatment <italic>D</italic>. CIA means that, conditional on <italic>X</italic>, the choice of <italic>D</italic> is statistically independent of <italic>U</italic>, the model error term. They are similar concepts and are treated as the same in this study.</p></fn>
<fn id="fn0004"><label>2</label><p>Because of the high-dimensional nuisance parameter space, machine learning methods are usually used to estimate <italic>g</italic><sub>0</sub>, which can thus be regarded as an ML estimator.</p></fn>
<fn id="fn0005"><label>3</label><p><xref ref-type="bibr" rid="B11">Chernozhukov et al. (2018)</xref> made this claim in their study. They prove that if the specific machine learning method used in the model has this property, then <italic>b</italic><sup>&#x0002A;</sup> converges as claimed, and they also show good simulation results. In practice, it is hard to establish theoretically how each method converges, but DML has been shown to outperform arbitrarily picking some variables and running a simple regression.</p></fn>
<fn id="fn0006"><label>4</label><p>For more detailed information, please refer to <xref ref-type="bibr" rid="B17">Koedel et al. (2017)</xref>.</p></fn>
<fn id="fn0007"><label>5</label><p>All test scores mentioned here are standardized student test scores, computed from the universe of student data collected by the California Department of Education (CDE).</p></fn>
<fn id="fn0008"><label>6</label><p>Namely, enVision Math California; California Mathematics: Concepts, Skills, and Problem Solving; California HSP Math.</p></fn>
<fn id="fn0009"><label>7</label><p>For details of the median method, see definition (3.3) in <xref ref-type="bibr" rid="B11">Chernozhukov et al. (2018)</xref>.</p></fn>
</fn-group>
</back>
</article>