<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Environ. Sci.</journal-id>
<journal-title>Frontiers in Environmental Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Environ. Sci.</abbrev-journal-title>
<issn pub-type="epub">2296-665X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">1252271</article-id>
<article-id pub-id-type="doi">10.3389/fenvs.2023.1252271</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Environmental Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Application of gradient boosting model to forecast corporate green innovation performance</article-title>
<alt-title alt-title-type="left-running-head">Zhang and Yin</alt-title>
<alt-title alt-title-type="right-running-head">
<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fenvs.2023.1252271">10.3389/fenvs.2023.1252271</ext-link>
</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Zhang</surname>
<given-names>Jingyi</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2352529/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Yin</surname>
<given-names>Kedong</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2513937/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>School of Economics</institution>, <institution>Ocean University of China</institution>, <addr-line>Qingdao</addr-line>, <addr-line>Shandong</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>School of Management Science and Engineering</institution>, <institution>Shandong University of Finance and Economics</institution>, <addr-line>Jinan</addr-line>, <addr-line>Shandong</addr-line>, <country>China</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>Institute of Marine Economics and Management</institution>, <institution>Shandong University of Finance and Economics</institution>, <addr-line>Jinan</addr-line>, <addr-line>Shandong</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1260224/overview">Stefan Cristian Gherghina</ext-link>, Bucharest Academy of Economic Studies, Romania</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/2370842/overview">Ali Shehadeh</ext-link>, Yarmouk University, Jordan</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1691237/overview">Michel Salomon</ext-link>, Universit&#xe9; Bourgogne Franche-Comt&#xe9;, France</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Jingyi Zhang, <email>jingyi.zhang@stu.ouc.edu.cn</email>
</corresp>
</author-notes>
<pub-date pub-type="epub">
<day>16</day>
<month>10</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="ecorrected">
<day>09</day>
<month>06</month>
<year>2026</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>11</volume>
<elocation-id>1252271</elocation-id>
<history>
<date date-type="received">
<day>03</day>
<month>07</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>02</day>
<month>10</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2023 Zhang and Yin.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Zhang and Yin</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Corporate green innovation performance can serve as a critical tool for policymakers to identify best practices and provide support to micro-entities in need. Accurate forecasting of corporate green innovation performance plays a vital role in innovation incentives by simulating the effects of regulations and strategies. Using data on China&#x2019;s A-share listed companies during 2010&#x2013;2020, this paper applies the gradient boosting algorithm to predict corporate green innovation performance and compares its predictions with those of the linear model, the decision tree model, and the random forest model. Subsequently, it examines the effectiveness of the influencing factors related to the enterprise&#x2019;s internal driving mechanism and external policy pressure in promoting corporate green innovation performance. It finds that: 1) The gradient boosting model outperforms the other methods in predictive performance. 2) An enterprise&#x2019;s resource base is a critical factor influencing its green innovation activities, and in particular, financial indicators have a significant incentive effect on corporate green innovation performance, indicating that the impetus from enterprises&#x2019; internal driving mechanism is crucial for enterprises&#x2019; green transformation. 3) The effect of secondary indicators is heterogeneous. Among the command-based environmental regulation tools, administrative penalties activate enterprises&#x2019; green innovation better than the approvals of Environmental Impact Assessment (EIA) documents for construction projects do; as for incentive-based environmental regulation, investment in pollution control projects has an apparent inducing effect on corporate green innovation performance, while the effect of the environmental tax follows an inverted U-shape, implying that overly stringent taxation crowds out corporate green innovation performance. 
4) Similarly, among the operating capacity indicators, a rising operating income growth rate can trigger the improvement of green innovation performance; nevertheless, the total asset turnover ratio shows a suppressing effect. The key to promoting corporate green innovation performance lies in effectively regulating the enterprises&#x2019; internal driving mechanism and the rational choice of external policy tools. This study helps to prospectively identify how corporate green innovation performance changes and provides theoretical guidance and micro evidence for policymakers on choosing environmental regulation tools and for enterprises on adjusting their resource bases.</p>
</abstract>
<kwd-group>
<kwd>green innovation performance</kwd>
<kwd>forecast</kwd>
<kwd>green patents</kwd>
<kwd>environmental regulation</kwd>
<kwd>gradient boosting model</kwd>
</kwd-group>
<counts>
<page-count count="18"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Environmental Policy and Governance</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>The relationship between environmental protection and economic development has become a global concern, along with the rising global temperature and frequent extreme weather events. As the world&#x2019;s second-largest economy and one of the largest carbon emitters, China is confronted with enormous environmental pressures. With the proposal of the new philosophy of innovative, coordinated, green, open, and shared development, the significance of greening and innovation has been formally established. In the meantime, the Chinese government has put forward the goals of achieving peak carbon emissions and carbon neutrality and pushed for the green transformation of economic and social development through innovation as the driving force of development, aiming at energy conservation and emission reduction and promoting high-quality economic development.</p>
<p>As the core carriers of social and economic wealth and simultaneously the claimants of natural resources, enterprises are the most critical factor in reconciling economic development and ecology (<xref ref-type="bibr" rid="B28">Li et al., 2019</xref>). Green innovation is an essential initiative for enterprises to reduce carbon emissions, limit environmental damage, and establish a competitive advantage (<xref ref-type="bibr" rid="B2">Berrone et al., 2013</xref>). However, the measures adopted by the Chinese government to enhance environmental protection and pollution control, such as administrative means, taxation, and technological tools, have yet to fully induce green innovation among enterprises. In addition, studies on the economic impact of environmental policies are rarely conducted at the micro level and are mostly based on the governmental and societal levels. Yet whether the green development concept proposed by the government can be transformed into policy dividends depends critically on the coping strategies of environmental pollution subjects (<xref ref-type="bibr" rid="B54">Zhang Q. et al., 2019</xref>). This study argues that activating corporate innovation performance relies not only on external regulation, such as technological tools, financial tools, or policy constraints and incentives, but also on the driving force of the internal development demand of micro-enterprises. To substantially stimulate green innovation, we need to consider the internal driving mechanism and examine how enterprises promote green innovation activities to achieve a &#x201c;win-win&#x201d; situation regarding environmental protection and enterprise competitiveness.</p>
<p>Some scholars have systematically reviewed the factors influencing corporate green innovation capabilities (<xref ref-type="bibr" rid="B45">Triguero et al., 2013</xref>; <xref ref-type="bibr" rid="B23">Hojnik and Ruzzier, 2016</xref>), which, in summary, include both internal and external factors. Internal factors include corporate organizational structure, corporate culture, management systems, human resources, and so on. More studies have been conducted on corporate organizational structure involving corporate governance mechanisms, environmental quality management systems, and stakeholder pressure (<xref ref-type="bibr" rid="B56">Zhang Z. G. et al., 2019</xref>; <xref ref-type="bibr" rid="B48">Wang et al., 2021</xref>). Besides, <xref ref-type="bibr" rid="B16">Hart (1995)</xref> considered the organizational capability of enterprises as a fundamental guarantee for implementing green innovation. Recently, from the perspective of executives&#x2019; human resource characteristics, some scholars considered that executives&#x2019; green experience, academic experience, and military experience contribute to corporate green innovation (<xref ref-type="bibr" rid="B9">Cho et al., 2017</xref>; <xref ref-type="bibr" rid="B29">Liu and Wang, 2021</xref>; <xref ref-type="bibr" rid="B31">Lu and Jiang, 2022</xref>). External factors include policy background, market demand, financial support, and technological progress. Extensive studies have focused on the environmental regulation factor, namely command-based and incentive-based environmental regulation (<xref ref-type="bibr" rid="B47">Wang and Qi, 2016</xref>; <xref ref-type="bibr" rid="B15">Guo, 2019</xref>; <xref ref-type="bibr" rid="B12">Duan and Xu, 2021</xref>). Studies on the impact of influencing factors have not reached a consensus. 
For example, some scholars found that government subsidies have an enhancing effect on green innovation, while others argued that a &#x201c;crowding out effect&#x201d; may exist (<xref ref-type="bibr" rid="B27">Li and Xiao, 2020</xref>; <xref ref-type="bibr" rid="B51">Zhang and Zhao, 2022</xref>). <xref ref-type="bibr" rid="B50">Wang and Wang (2021)</xref> suggested that the effective combination of green finance and green innovation is an essential driving force for achieving green development; nevertheless, the implementation of the <italic>Green Credit Guidelines</italic> policy did not significantly improve the quality of green innovation.</p>
<p>Research on corporate green innovation performance prediction is still at an early stage, with an incomplete theoretical framework and no established methodological system. In enterprise-level innovation prediction, <xref ref-type="bibr" rid="B49">Wang and Chien (2006)</xref> applied neural network models to forecast enterprises&#x2019; innovation performance. <xref ref-type="bibr" rid="B8">Chien et al. (2010)</xref> used an adaptive neuro-fuzzy inference system (ANFIS) based on a neural network modeling algorithm to predict innovation performance through technological information resources and innovation objectives. <xref ref-type="bibr" rid="B22">Ho and Tsai (2011)</xref> used structural equation modeling (SEM) and ANFIS to predict the effect of value innovation and new product development (NPD) quality on NPD performance, and concluded that ANFIS models predict better than SEM. These methods are more accurate than traditional statistical forecasting models because they capture the non-linear characteristics of corporate innovation activities. Moreover, scholars have provided empirical evidence for this hypothesis at the regional (<xref ref-type="bibr" rid="B18">Hajek and Henriques, 2017</xref>) and national levels (<xref ref-type="bibr" rid="B10">De la Paz-Marin et al., 2012</xref>). Regarding innovation systems, <xref ref-type="bibr" rid="B41">Samara et al. (2012)</xref> developed an integrated system dynamics approach to analyze the impact of innovation policies on the performance of national innovation systems. <xref ref-type="bibr" rid="B17">Hajek et al. (2019)</xref> used a predictive model based on genetic programming variants to predict regional innovation performance, including the number of patents, technological and non-technological innovation activities, and the economic impact of innovation. Current research on innovation forecasting at the firm level is scarce. 
In addition, no study has yet used machine learning methods to forecast corporate green innovation performance and to examine the effects of corporate financial capability and environmental regulations on corporate green innovation performance. The main difficulty in measuring corporate green innovation performance lies in the complexity of corporate green innovation systems, which are characterized by non-linearity and high variance; finding an accurate and reliable forecasting tool to support decision-making is therefore a challenging task. To solve this problem, we adopt the gradient boosting model to predict the green innovation performance of Chinese enterprises. Compared with traditional statistical prediction methods and other machine learning approaches, the gradient boosting algorithm eliminates the need for complex mathematical representations of the input-output relationship, and it is more advantageous in modeling datasets with high variance and intrinsic non-linear characteristics.</p>
<p>From the perspective of financial development, the prediction of corporate green innovation performance helps to guide the flow of financial and social capital from the heavily polluting sector to the green transformation sector, which not only strengthens the efficiency of using green innovation resources but also reinforces the monitoring function of financial institutions and the community on the debtors, thus boosting their efficiency in environmental responsibility. For policymakers, the prediction helps them to pinpoint each enterprise&#x2019;s green innovation capability level and to implement specific policies prospectively and pertinently. From the view of enterprise innovation, enterprises can, on the one hand, assess their innovation performance and competitive position in advance and thereby adjust their green strategy in time; on the other hand, identify the leading green innovation enterprises within and outside the industry, learn from their development experience and inspire their sense of social responsibility and motivation for green innovation. For investors, scientific forecasting can help improve corporate information transparency and anticipate corporate green development&#x2019;s prospects and potential risks for better decision-making. In summary, this research focuses on building a scientific and practical prediction model of corporate green innovation performance, clarifying the effectiveness and heterogeneity of each influencing factor, and fully exploiting the synergy and complementary effects of each driving factor to enhance corporate green innovation performance.</p>
<p>Current research has limitations: 1. Most existing literature focuses on the relationship between a single policy shock and corporate green innovation. Few studies integrate the intrinsic driving mechanism and external policy constraints into a unified research framework to compare the different induced outcomes of green innovation, and even fewer examine the implementation effects for developing countries, which may bias the study of corporate green innovation incentives. 2. Heterogeneity may cause environmental pollution subjects to differ in their sensitivity to specific tools within the same category of influencing factors, yet most studies apply a &#x201c;one size fits all&#x201d; indicator, failing to capture the effects of heterogeneous tools. 3. Most studies on financial performance and environmental protection discuss only the unidirectional effect of enterprises&#x2019; environmental responsibility on financial performance. Analysis of the effect of financial performance factors on corporate green innovation performance is sparse, with only a small part of the literature (<xref ref-type="bibr" rid="B54">Zhang Q. et al., 2019</xref>; <xref ref-type="bibr" rid="B42">Sheng et al., 2019</xref>; <xref ref-type="bibr" rid="B52">Zhang Chi, 2020</xref>; <xref ref-type="bibr" rid="B34">Meng et al., 2023</xref>) discussing financial performance as a mediating variable in its effect on enterprises&#x2019; environmental responsibility. 4. Existing studies of this type rely on econometric models and rarely apply non-linear algorithms such as machine learning. They therefore fail to capture the non-linear relationships between the relevant variables and green technology innovation or to uncover the underlying mechanisms, which may bias estimates of the economic consequences of environmental regulation and the exploration of the driving factors of green innovation.</p>
<p>Using the historical data related to enterprises&#x2019; resource bases and the current environmental regulations, the models in this study accurately predict the green patents obtained by the enterprise in the current year, including green inventions independently obtained in the year, green utility models independently obtained in the year, green inventions jointly obtained in the year, and green utility models jointly obtained in the year. It clarifies the different driving effects of the influencing factors, aiming to provide a theoretical basis and empirical reference for enterprises, investors, and policymakers. The main contributions of this paper are as follows.</p>
<p>First, it provides new evidence for the debate on whether each influencing factor crowds out or triggers corporate green innovation capability, and it suggests how to motivate enterprises to change their green development mindset and actively carry out green innovation activities. From the perspective of the heterogeneity of the enterprises&#x2019; internal driving mechanism and external environmental regulation tools, this paper finds that corporate green innovation performance is induced by superior profitability and solvency, the growth rate of operating income, administrative penalties, and investment in pollution control projects, rather than by a high total asset turnover rate, the approvals of EIA documents for construction projects, or an overly stringent environmental protection tax.</p>
<p>Second, it moves beyond current micro-level research based on a single environmental regulation policy shock and extends research on the factors influencing corporate green innovation performance to cover both the internal driving mechanism and external environmental regulations. It provides theoretical guidance for current environmental regulation policy decisions and the coping strategies of enterprises. This paper suggests that the government should accurately position enterprises, make full use of the &#x201c;push-back&#x201d; effect of administrative penalties, and strengthen the enforcement of command-based regulations; meanwhile, the government should strengthen incentives and support for enterprises&#x2019; green innovation activities by increasing investment in environmental governance. Enterprises should fully utilize their resources and actively take responsibility for environmental protection to achieve the &#x201c;double dividend&#x201d; of enterprise competitiveness and environmental protection.</p>
<p>Third, expanding on previous research methods, this study applies a machine learning algorithm to construct an effective prediction model, providing a more practical research method for predicting green innovation performance. Subsequently, based on the gradient boosting algorithm, the non-linear relationships between the influencing factors and corporate green innovation performance are explored through relative importance analysis and partial dependence functions, deconstructing the black box of green innovation incentive research and expanding the thinking and methods of green innovation research in China.</p>
<p>This study is structured as follows. In the introduction, we review the relevant literature and list the main contributions of this study. In <xref ref-type="sec" rid="s2">Section 2</xref>, we elaborate on the theoretical foundations and formulate the research hypotheses. <xref ref-type="sec" rid="s3">Section 3</xref> presents the research design. <xref ref-type="sec" rid="s4">Section 4</xref> discusses the empirical results and analyses. We empirically predict corporate green innovation performance and analyze the effect of corporate financial capability and environmental regulations on promoting corporate green innovation performance and heterogeneity. <xref ref-type="sec" rid="s5">Section 5</xref> summarizes the research conclusions and the implications for effectively incentivizing corporate green innovation.</p>
</sec>
<sec id="s2">
<title>2 Theoretical foundations and research hypotheses</title>
<sec id="s2-1">
<title>2.1 Corporate financial capability and corporate green innovation performance</title>
<p>Corporate financial capability refers to the ability of an enterprise to command controllable financial resources (<xref ref-type="bibr" rid="B55">Zhang, 2003</xref>). In the narrow sense, it refers to an enterprise&#x2019;s capability for financial performance (<xref ref-type="bibr" rid="B1">Beaver, 1996</xref>; <xref ref-type="bibr" rid="B30">Liu, 2016</xref>), comprising profitability, solvency, development capability, and operating capacity. Given its research purpose, this study adopts the narrow definition.</p>
<p>Based on the theory of redundant resources, whether an enterprise engages in social responsibility, including environmental protection, depends mainly on its ability to deploy sufficient redundant resources (<xref ref-type="bibr" rid="B38">Preston and Obannon, 1997</xref>; <xref ref-type="bibr" rid="B5">Campbell, 2007</xref>). Corporate financial capability directly determines the amount of redundant resources available to the enterprise to meet its social responsibilities, such as environmental protection responsibilities (<xref ref-type="bibr" rid="B46">Waddock and Graves, 1997</xref>), and affects the effectiveness of enterprises&#x2019; implementation of various environmental protection measures. In other words, corporate financial capability is the economic condition for enterprises to take environmental responsibility. Enterprises are more likely to engage in social responsibility by using their resources only when their development needs are met (<xref ref-type="bibr" rid="B38">Preston and Obannon, 1997</xref>). Under financial constraints, investment in environmental governance is bound to have a crowding-out effect on productive investment in the short run (<xref ref-type="bibr" rid="B50">Wang and Wang, 2021</xref>), while enterprises with financial advantages will have more flexibility to invest in CSR-related activities and thus better take environmental responsibility (<xref ref-type="bibr" rid="B7">Cheng et al., 2014</xref>; <xref ref-type="bibr" rid="B52">Zhang, 2020</xref>). <xref ref-type="bibr" rid="B20">Hasan and Habib (2015)</xref> believed that in the long run, organizations with better financial capability have more redundant resources and are better able to absorb CSR-related investments and make cost adjustments, thus facilitating the assumption of social responsibility. 
Green innovation projects are characterized by long cycles, significant investments, high risks, and considerable uncertainties regarding the transformation of innovation results and the generation of economic benefits. Therefore, the enthusiasm of enterprises for green investment is constrained by their own resources: enterprises with a weak resource base have little willingness to engage in innovation activities (<xref ref-type="bibr" rid="B27">Li and Xiao, 2020</xref>). In summary, corporate financial performance can affect corporate green innovation performance.<list list-type="simple">
<list-item>
<p>(1) Profitability. <xref ref-type="bibr" rid="B32">Lu et al. (2014)</xref> pointed out that the business capacity of enterprises has a positive impact on the assumption of corporate social responsibility, including environmental responsibility. Strong and sustainable profitability brings enterprises a steady flow of material resources, so that once their development needs are met, they still have sufficient redundant resources for environmental responsibility.</p>
</list-item>
</list>
</p>
<p>H<sub>1-1</sub>: The profitability of enterprises positively affects corporate green innovation performance.<list list-type="simple">
<list-item>
<p>(2) Solvency. Enterprises with stronger solvency usually have more liquid assets and cash flow, more stable market shares, and a stronger ability to withstand risks (<xref ref-type="bibr" rid="B19">Hao, 2019</xref>). Such enterprises are more willing to assume social and environmental responsibilities (<xref ref-type="bibr" rid="B40">Ross, 1977</xref>). Because environmental protection investment has a long payback period, high costs, and high investment risk, strong solvency reduces the enterprise&#x2019;s debt repayment pressure and the likelihood of refinancing and strengthens its ability to withstand risks. This eases the risk borne by environmental protection investors and makes the enterprise more willing to participate in environmental protection, thus enhancing corporate green innovation performance.</p>
</list-item>
</list>
</p>
<p>H<sub>1-2</sub>: The solvency of enterprises positively affects corporate green innovation performance.<list list-type="simple">
<list-item>
<p>(3) Operating capacity. A stronger operating capacity means enterprises have more resources to deploy and allocate to fulfill environmental protection responsibilities. On the one hand, enterprises with stronger operating capacity can make full use of existing resources and convert them into available resources and cash flow faster (<xref ref-type="bibr" rid="B19">Hao, 2019</xref>). On the other hand, a stronger operating capacity can ensure the reasonable allocation of environmental protection and green innovation resources.</p>
</list-item>
</list>
</p>
<p>H<sub>1-3</sub>: The enterprise&#x2019;s operating capacity positively affects corporate green innovation performance.<list list-type="simple">
<list-item>
<p>(4) Development capacity. Compared with enterprises in the rapid growth stage, enterprises in the mature stage have a larger scale and higher profit margins and accumulate more redundant resources (<xref ref-type="bibr" rid="B20">Hasan and Habib, 2015</xref>). By contrast, early-stage enterprises mainly use resources for production and sales, which inevitably squeezes out a share of environmental protection investment. In addition, fast-growing enterprises tend to have fewer financial resources available for, and less willingness to invest in, environmental protection because of the good market prospects and rapidly increasing market competitiveness of their products or services.</p>
</list-item>
</list>
</p>
<p>H<sub>1-4</sub>: The enterprise&#x2019;s development capacity negatively affects corporate green innovation performance.</p>
</sec>
<sec id="s2-2">
<title>2.2 Environmental regulation and corporate green innovation performance</title>
<p>Command-based environmental regulation, characterized by strong deterrence and distinct signaling effects, refers to the government&#x2019;s setting of environmental protection standards and objectives through enacting laws or administrative rules and regulations. Incentive-based environmental regulation refers to the government&#x2019;s efforts to guide enterprises toward green transformation through subsidies or financial, tax, and fee instruments, forming a long-term mechanism of environmental protection incentives and constraints. These two regulatory tools deliver policy orientation signals of different intensities to enterprises, affecting their perceptions of environmental pressures with different expected effects and thus influencing their green innovation decisions.<list list-type="simple">
<list-item>
<p>(1) Incentive-based environmental regulation and corporate green innovation performance.</p>
</list-item>
</list>
</p>
<p>From the perspective of organizational legitimacy and the resource base, compliance with environmental regulations is the foundation of enterprises, and scarce innovation resources and government policy support tend to flow to enterprises with a strong sense of social responsibility that respond actively to policy guidance (<xref ref-type="bibr" rid="B26">Li et al., 2018</xref>; <xref ref-type="bibr" rid="B11">Deng et al., 2021</xref>). The resource effect theory holds that differences in the scarce innovation resources and government support policies that enterprises possess create differences in their green innovation capabilities. It has been shown that policy support such as government subsidies provides a resource base for corporate green innovation, alleviates the financing dilemma of corporate green innovation, and reduces the cost of corporate green transformation (<xref ref-type="bibr" rid="B35">Montmarin and Herrera, 2015</xref>). Resource constraints and insufficient incentives limit corporate green innovation (<xref ref-type="bibr" rid="B33">Manso, 2011</xref>). Incentive-based environmental regulatory tools, such as investment in pollution control projects, can effectively help to overcome these difficulties and promote corporate investment in green innovation.</p>
<p>However, the neoclassical school argues that environmental regulations, such as pollution charges fees (environmental taxes since the year 2018), increase compliance costs and exacerbate the financial constraints of enterprises, crowding out green innovation resources (<xref ref-type="bibr" rid="B36">Petroni et al., 2019</xref>). Whereas green innovation requires long-term and substantial resource investments, decision-makers will reduce their green innovation investments due to enterprises&#x2019; short-term business performance and cash flow pressure.<list list-type="simple">
<list-item>
<p>(2) Command-based environmental regulation and corporate green innovation performance.</p>
</list-item>
</list>
</p>
<p>According to the Porter hypothesis, appropriate environmental regulation has a push effect on corporate green innovation (<xref ref-type="bibr" rid="B37">Porter and Van der Linde, 1995</xref>). Confronted with rigid command-based environmental regulation, firms tend to create more green innovations, thereby reducing environmental pollution, enhancing green competitiveness, and effectively circumventing environmental regulatory costs (<xref ref-type="bibr" rid="B2">Berrone et al., 2013</xref>).</p>
<p>However, if the cost of complying with government regulation is much lower than that of green innovation, enterprises may make environmental investments merely to meet government requirements rather than committing to green innovation. That is, because green innovation investment has a longer payback period and greater uncertainty, the incentive for enterprises to engage in green innovation is suppressed if they can meet regulatory standards through environmental investment or direct environmental restoration (<xref ref-type="bibr" rid="B28">Li et al., 2019</xref>). For example, enterprises can obtain environmental impact assessment (EIA) document approvals for construction projects through other means, thus avoiding the pressure and cost of green innovation, which may crowd out some resources and incentives to engage in green innovation.</p>
<p>H<sub>2</sub>: Environmental regulation tools have heterogeneous effects on corporate green innovation performance.</p>
</sec>
</sec>
<sec sec-type="methods" id="s3">
<title>3 Methods</title>
<sec id="s3-1">
<title>3.1 Description of variables</title>
<sec id="s3-1-1">
<title>3.1.1 Corporate green innovation performance</title>
<p>Drawing on the study of <xref ref-type="bibr" rid="B39">Qi et al. (2018)</xref>, we manually searched patents by IPC classification number based on the &#x201c;Green List of International Patent Classification&#x201d; published by the World Intellectual Property Organization (WIPO) in 2010, counting the green inventions independently acquired in the current year, green utility models independently acquired in the current year, green inventions jointly acquired in the current year, and green utility models jointly acquired in the current year. The sum of these four indicators, net of fixed effects, serves as the green patent indicator that measures corporate green innovation performance.</p>
</sec>
<sec id="s3-1-2">
<title>3.1.2 Corporate financial capability</title>
<p>Based on the studies of scholars such as <xref ref-type="bibr" rid="B13">Fan and Lang (2007)</xref> and <xref ref-type="bibr" rid="B52">Zhang (2020)</xref> and referring to the provisions of the &#x201c;<italic>Enterprise Economic Efficiency Evaluation Index System (Implementation)</italic>&#x201d; and the &#x201c;<italic>Rules for Evaluating the Performance of State-owned Capital Funds</italic>&#x201d;, we group the enterprise financial data into two categories, financial indicators and operating capacity, and select ten representative secondary indicators to characterize them. Financial indicators include total assets, net fixed assets, total liabilities, paid-in capital or equity, total profit, and net profit; they reflect the enterprise&#x2019;s financial status and operating benefit and are used to assess its profitability, solvency, and financial soundness. Operating capacity includes net cash flow from operating activities, the total annual market value of individual shares, the total asset turnover ratio, and the operating income growth rate; these indicators reflect the enterprise&#x2019;s operating capacity and development potential and are used to assess operating efficiency and growth.</p>
</sec>
<sec id="s3-1-3">
<title>3.1.3 Environmental regulation</title>
<p>Environmental regulation policy is not only an arrangement by which the government restricts and regulates the behavior of enterprises but also a vital factor affecting corporate green innovation. Based on <xref ref-type="bibr" rid="B3">Bo et al.&#x2019;s (2018)</xref> and <xref ref-type="bibr" rid="B43">Tan and Xu&#x2019;s (2022)</xref> studies, we classify environmental regulation tools into command-based and incentive-based environmental regulation. Considering data availability and representativeness, we then divide these primary indicators into heterogeneous secondary variables. Command-based environmental regulation includes the number of penalty decisions, the number of EIA document approvals for construction projects in the year, the number of National People&#x2019;s Congress (NPC) proposals, and the number of Chinese People&#x2019;s Political Consultative Conference (CPPCC) proposals, reflecting the government&#x2019;s mandatory supervision and punishment measures. Incentive-based environmental regulation includes the investment in pollution control projects completed in the year (Renminbi (RMB) million), the investment in industrial pollution control (RMB million), and pollution charge fees (environmental taxes since 2018), reflecting the support and incentives provided by environmental protection departments to enterprises. Finally, we construct the heterogeneous environmental regulation variables within the framework of command-based and incentive-based regulation tools using the widely applied composite index method. Secondary indicators measured in different units are standardized to obtain dimensionless variables.</p>
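<p>The standardization and aggregation step can be illustrated with a minimal Python sketch. It assumes z-score standardization and equal weights across secondary indicators; neither choice is stated in the text, and the function names and input values are hypothetical.</p>
<preformat>
```python
def standardize(values):
    """Convert a series to dimensionless z-scores (mean 0, standard deviation 1)."""
    n = len(values)
    mu = sum(values) / n
    sd = (sum((v - mu) ** 2 for v in values) / n) ** 0.5
    # Note: a constant series has sd == 0 and would need separate handling.
    return [(v - mu) / sd for v in values]


def composite_index(indicators):
    """Average the standardized indicators observation by observation.

    Equal weights are an assumption made for illustration only.
    """
    columns = [standardize(col) for col in indicators]
    return [sum(obs) / len(obs) for obs in zip(*columns)]


# Hypothetical values for three regions: penalty decisions and EIA approvals
penalty_decisions = [100.0, 200.0, 300.0]
eia_approvals = [5000.0, 7000.0, 9000.0]
command_index = composite_index([penalty_decisions, eia_approvals])
```
</preformat>
<p>Because both hypothetical indicators rise in equal steps, their standardized values coincide and the composite index simply reproduces them; with real data, the index would blend indicators that rank regions differently.</p>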
</sec>
<sec id="s3-1-4">
<title>3.1.4 Industry attributes</title>
<p>Considering that the green development of enterprises is significantly influenced by market competition, business conditions, environmental strategies, technology base, and policy background, we include industry attributes among the influencing factors. One-hot encoding is a standard method for converting categorical variables into a binary vector representation. Specifically, for a categorical variable with <italic>n</italic> different values, one-hot encoding creates a binary vector of length <italic>n</italic> in which only the position corresponding to the observed value is 1 and all other positions are 0. The machine learning algorithms and statistical models applied in this study cannot handle categorical (nominal) variables directly and require numerical inputs. Therefore, we convert the text industry codes into binary vectors using one-hot encoding to facilitate algorithm processing and analysis.</p>
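<p>A minimal Python sketch of this encoding is given below. The function name and the sample industry codes are hypothetical and serve only to illustrate the mapping from a category to a binary vector.</p>
<preformat>
```python
def one_hot_encode(codes):
    """Map each category to a binary vector of length n (n = number of distinct values)."""
    categories = sorted(set(codes))                 # fixed column order
    index = {c: i for i, c in enumerate(categories)}
    vectors = []
    for code in codes:
        row = [0] * len(categories)
        row[index[code]] = 1                        # only the observed category is set to 1
        vectors.append(row)
    return categories, vectors


# Hypothetical industry codes for four firm-year records
categories, vectors = one_hot_encode(["C", "B", "C", "N"])
```
</preformat>
<p>Each record is thus represented by a vector with exactly one entry equal to 1, so no artificial ordering is imposed on the industry codes.</p>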
<p>The explanatory variables are listed in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Explanatory variables.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">Primary variables</th>
<th align="center">Secondary variables</th>
<th align="center">Variable symbols</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td rowspan="6" align="center">Financial indicators</td>
<td align="center">Total assets</td>
<td align="center">TA</td>
</tr>
<tr>
<td align="center">Net Fixed Assets</td>
<td align="center">NFA</td>
</tr>
<tr>
<td align="center">Total liabilities</td>
<td align="center">TL</td>
</tr>
<tr>
<td align="center">Paid-in capital or equity</td>
<td align="center">PIC</td>
</tr>
<tr>
<td align="center">Total profit</td>
<td align="center">TP</td>
</tr>
<tr>
<td align="center">Net profit</td>
<td align="center">NP</td>
</tr>
<tr>
<td rowspan="4" align="center">Operating Capacity</td>
<td align="center">Net cash flow from operating activities</td>
<td align="center">NCF</td>
</tr>
<tr>
<td align="center">Total annual market value of individual stocks</td>
<td align="center">TAMV</td>
</tr>
<tr>
<td align="center">Total Asset Turnover Ratio</td>
<td align="center">TAT</td>
</tr>
<tr>
<td align="center">Operating Income Growth Rate</td>
<td align="center">OIGR</td>
</tr>
<tr>
<td rowspan="4" align="center">Command-based environmental regulation</td>
<td align="center">Number of Penalty Decisions</td>
<td align="center">PD</td>
</tr>
<tr>
<td align="center">Number of EIA document approvals for construction projects in the current year</td>
<td align="center">EIADA</td>
</tr>
<tr>
<td align="center">Number of NPC proposals</td>
<td align="center">NNPC</td>
</tr>
<tr>
<td align="center">Number of CPPCC proposals</td>
<td align="center">NCPPCC</td>
</tr>
<tr>
<td rowspan="3" align="center">Incentive-based environmental regulation</td>
<td align="center">Pollution control projects completed investment in the current year (RMB million)</td>
<td align="center">PCP</td>
</tr>
<tr>
<td align="center">Investment in industrial pollution control (RMB million)</td>
<td align="center">IIPC</td>
</tr>
<tr>
<td align="center">Pollution charge fees (environmental taxes since 2018)</td>
<td align="center">ET</td>
</tr>
<tr>
<td align="center">Industry attributes</td>
<td align="center">Industry code</td>
<td align="center">IC</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec id="s3-2">
<title>3.2 Description of the sample data</title>
<p>This paper takes all A-share listed companies in China during 2010&#x2013;2020 as the research object (data after 2020 have yet to be published in the <italic>China Environmental Yearbook</italic>); the sample contains 25,579 observations. The data sources are as follows: (1) Financial capacity data were obtained from the <italic>China Stock Market and Accounting Research Database</italic> (CSMAR), with reference to the provisions of the &#x201c;<italic>Enterprise Economic Efficiency Evaluation Index System (Implementation)</italic>&#x201d; and the &#x201c;<italic>Rules for Evaluating the Performance of State-owned Capital Funds</italic>&#x201d;. (2) The environmental regulation indicators were derived from the <italic>China Statistical Yearbook</italic>, <italic>China Environmental Yearbook</italic>, <italic>China Environmental Statistical Yearbook</italic>, and <italic>China Taxation Yearbook</italic> and compiled by manual calculation. (3) As the measure of corporate green innovation performance, data on corporate green patents, including green inventions independently obtained in the year, green utility models independently obtained in the year, green inventions jointly obtained in the year, and green utility models jointly obtained in the year, were obtained from the <italic>Chinese Research Data Service (CNRDS)</italic> database. The classification follows the standard of the World Intellectual Property Organization (WIPO), which classifies patents by patent classification number. (4) The remaining indicators were obtained from the <italic>China Stock Market and Accounting Research (CSMAR)</italic> database. The raw data were screened as follows: (1) Samples of listed companies in the ST, PT, and financial categories were excluded; (2) Sample data for the Tibetan region were excluded because fundamental environmental regulation indicators are missing; (3) Outlier treatment. Outliers may harm the modeling, so we delete the extreme 1% of values and account for the fixed effects of individual enterprises to limit their influence on the model; (4) Feature scaling. The independent variables differ in nature and in their ranges of variation, so we normalize them to ensure that all variables share the same scale.</p>
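<p>Screening steps (3) and (4) can be sketched in Python as follows. The sketch assumes that the 1% rule is applied at both tails and that normalization means min-max scaling to [0, 1]; the text specifies neither, and the function names and data are hypothetical.</p>
<preformat>
```python
def trim_extremes(values, lower=0.01, upper=0.99):
    """Drop observations outside the [lower, upper] empirical quantile range."""
    ordered = sorted(values)
    last = len(ordered) - 1
    low_cut = ordered[round(lower * last)]      # nearest-rank cut-offs
    high_cut = ordered[round(upper * last)]
    return [v for v in values if high_cut >= v >= low_cut]


def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    low, high = min(values), max(values)
    span = high - low
    return [(v - low) / span if span > 0 else 0.0 for v in values]


# Hypothetical feature: 100 ordinary values plus one extreme observation
raw = [float(v) for v in range(1, 101)] + [10000.0]
trimmed = trim_extremes(raw)
scaled = min_max_normalize(trimmed)
```
</preformat>
<p>In this toy series the extreme value 10,000 is removed before scaling, so it no longer compresses the remaining observations toward zero.</p>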
<p>
<xref ref-type="table" rid="T2">Tables 2</xref>&#x2013;<xref ref-type="table" rid="T4">4</xref> show the descriptive statistics of financial data, environmental regulation data, and the number of corporate green patents by industry.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Descriptive statistics of financial data of listed companies.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th align="center">Total assets</th>
<th align="center">Net fixed assets</th>
<th align="center">Total liabilities</th>
<th align="center">Paid-in capital or equity</th>
<th align="center">Total profit</th>
<th align="center">Net profit</th>
<th align="center">Net cash flow from operating activities</th>
<th align="center">Total asset turnover ratio</th>
<th align="center">Operating income growth rate</th>
<th align="center">Total annual market value of individual stocks</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">count</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
</tr>
<tr>
<td align="center">mean</td>
<td align="right">6.3 &#xd7; 10<sup>&#x2b;10</sup>
</td>
<td align="right">3.88 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">5.35 &#xd7; 10<sup>&#x2b;10</sup>
</td>
<td align="right">1.78 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">1.3 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">1.02 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">1.78 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">0.654808</td>
<td align="right">6.43</td>
<td align="right">1.41 &#xd7; 10<sup>&#x2b;10</sup>
</td>
</tr>
<tr>
<td align="center">std</td>
<td align="right">8.05 &#xd7; 10<sup>&#x2b;11</sup>
</td>
<td align="right">2.35 &#xd7; 10<sup>&#x2b;10</sup>
</td>
<td align="right">7.42 &#xd7; 10<sup>&#x2b;11</sup>
</td>
<td align="right">1.32 &#xd7; 10<sup>&#x2b;10</sup>
</td>
<td align="right">1.19 &#xd7; 10<sup>&#x2b;10</sup>
</td>
<td align="right">9.36 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">2.52 &#xd7; 10<sup>&#x2b;10</sup>
</td>
<td align="right">0.561137</td>
<td align="right">8.47 &#xd7; 10<sup>&#x2b;02</sup>
</td>
<td align="right">5.41 &#xd7; 10<sup>&#x2b;10</sup>
</td>
</tr>
<tr>
<td align="center">min</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">&#x2212;2,033,024</td>
<td align="right">35,000,000</td>
<td align="right">&#x2212;4.9 &#xd7; 10<sup>&#x2b;10</sup>
</td>
<td align="right">&#x2212;4.7 &#xd7; 10<sup>&#x2b;10</sup>
</td>
<td align="right">&#x2212;4.8 &#xd7; 10<sup>&#x2b;11</sup>
</td>
<td align="right">0.000061</td>
<td align="right">&#x2212;1.00</td>
<td align="right">2.65 &#xd7; 10<sup>&#x2b;08</sup>
</td>
</tr>
<tr>
<td align="center">25%</td>
<td align="right">1.48 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">1.86 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">4.18 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">2.37 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">45,684,285</td>
<td align="right">36,520,743</td>
<td align="right">1,262,720</td>
<td align="right">0.34568</td>
<td align="right">&#x2212;1.13 &#xd7; 10<sup>-02</sup>
</td>
<td align="right">3.06 &#xd7; 10<sup>&#x2b;09</sup>
</td>
</tr>
<tr>
<td align="center">50%</td>
<td align="right">3.26 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">4.96 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">1.29 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">4.64 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">1.32 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">1.08 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">99,361,134</td>
<td align="right">0.543146</td>
<td align="right">1.20 &#xd7; 10<sup>-01</sup>
</td>
<td align="right">5.44 &#xd7; 10<sup>&#x2b;09</sup>
</td>
</tr>
<tr>
<td align="center">75%</td>
<td align="right">8.5 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">1.47 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">4.41 &#xd7; 10<sup>&#x2b;09</sup>
</td>
<td align="right">9.76 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">4.02 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">3.28 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">3.79 &#xd7; 10<sup>&#x2b;08</sup>
</td>
<td align="right">0.806326</td>
<td align="right">2.88 &#xd7; 10<sup>-01</sup>
</td>
<td align="right">1.07 &#xd7; 10<sup>&#x2b;10</sup>
</td>
</tr>
<tr>
<td align="center">max</td>
<td align="right">3.01 &#xd7; 10<sup>&#x2b;13</sup>
</td>
<td align="right">7.33 &#xd7; 10<sup>&#x2b;11</sup>
</td>
<td align="right">2.74 &#xd7; 10<sup>&#x2b;13</sup>
</td>
<td align="right">3.56 &#xd7; 10<sup>&#x2b;11</sup>
</td>
<td align="right">3.92 &#xd7; 10<sup>&#x2b;11</sup>
</td>
<td align="right">3.13 &#xd7; 10<sup>&#x2b;11</sup>
</td>
<td align="right">1.13 &#xd7; 10<sup>&#x2b;12</sup>
</td>
<td align="right">12.37286</td>
<td align="right">1.35 &#xd7; 10<sup>&#x2b;05</sup>
</td>
<td align="right">1.82 &#xd7; 10<sup>&#x2b;12</sup>
</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Descriptive statistics of environmental regulations of listed companies.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th align="center">Number of penalty decisions</th>
<th align="center">Number of EIA document approvals for construction projects</th>
<th align="center">Number of NPC proposals</th>
<th align="center">Number of CPPCC proposals</th>
<th align="center">Pollution control projects completed investment</th>
<th align="center">Investment in industrial pollution control</th>
<th align="center">Pollution charge fees (environmental taxes since 2018)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">count</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
<td align="right">25,579</td>
</tr>
<tr>
<td align="center">mean</td>
<td align="right">8,403.244</td>
<td align="right">18,227.19</td>
<td align="right">325.0637</td>
<td align="right">437.9783</td>
<td align="right">2,442,218</td>
<td align="right">309,012.5</td>
<td align="right">80,816.97</td>
</tr>
<tr>
<td align="center">std</td>
<td align="right">7,454.154</td>
<td align="right">16,097.69</td>
<td align="right">215.2112</td>
<td align="right">408.8827</td>
<td align="right">1,892,429</td>
<td align="right">253,113.9</td>
<td align="right">61,808.99</td>
</tr>
<tr>
<td align="center">min</td>
<td align="right">47</td>
<td align="right">64</td>
<td align="right">11</td>
<td align="right">11</td>
<td align="right">63,400</td>
<td align="right">3,576</td>
<td align="right">2,848.6</td>
</tr>
<tr>
<td align="center">25%</td>
<td align="right">2,413</td>
<td align="right">5,852</td>
<td align="right">115</td>
<td align="right">156</td>
<td align="right">1,010,000</td>
<td align="right">119,568</td>
<td align="right">36,399</td>
</tr>
<tr>
<td align="center">50%</td>
<td align="right">5,943</td>
<td align="right">13,304</td>
<td align="right">306</td>
<td align="right">443</td>
<td align="right">1,868,200</td>
<td align="right">264,812</td>
<td align="right">65,184</td>
</tr>
<tr>
<td align="center">75%</td>
<td align="right">12,054</td>
<td align="right">28,926</td>
<td align="right">485</td>
<td align="right">617</td>
<td align="right">3,367,127</td>
<td align="right">420,272</td>
<td align="right">94,477</td>
</tr>
<tr>
<td align="center">max</td>
<td align="right">45,140</td>
<td align="right">68,417</td>
<td align="right">1,196</td>
<td align="right">5,567</td>
<td align="right">12,627,300</td>
<td align="right">1,416,464</td>
<td align="right">358,888</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>The number of green patents obtained by listed companies by industry.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Industry code</th>
<th align="center">Industry name</th>
<th align="center">Count</th>
<th align="center">Mean</th>
<th align="center">Standard deviation</th>
<th align="center">Maximum</th>
<th align="center">Minimum</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">A</td>
<td align="center">Agriculture, forestry, animal husbandry and fishery</td>
<td align="center">372</td>
<td align="center">0.31</td>
<td align="center">1.95</td>
<td align="center">29</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">B</td>
<td align="center">Extractive industry</td>
<td align="center">616</td>
<td align="center">15.74</td>
<td align="center">92.69</td>
<td align="center">991</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">C</td>
<td align="center">Manufacturing</td>
<td align="center">16,344</td>
<td align="center">1.92</td>
<td align="center">10.67</td>
<td align="center">450</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">D</td>
<td align="center">Electricity, gas and water production and supply</td>
<td align="center">803</td>
<td align="center">1.97</td>
<td align="center">15.80</td>
<td align="center">341</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">E</td>
<td align="center">Construction</td>
<td align="center">678</td>
<td align="center">1.93</td>
<td align="center">4.65</td>
<td align="center">55</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">F</td>
<td align="center">Transportation, storage industry</td>
<td align="center">1,314</td>
<td align="center">0.02</td>
<td align="center">0.17</td>
<td align="center">3</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">G</td>
<td align="center">Information Technology Industry</td>
<td align="center">750</td>
<td align="center">0.05</td>
<td align="center">0.32</td>
<td align="center">5</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">H</td>
<td align="center">Wholesale and retail trade</td>
<td align="center">99</td>
<td align="center">0.00</td>
<td align="center">0.00</td>
<td align="center">0</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">J</td>
<td align="center">Real estate industry</td>
<td align="center">500</td>
<td align="center">0.45</td>
<td align="center">2.03</td>
<td align="center">20</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">K</td>
<td align="center">Social Services</td>
<td align="center">1,152</td>
<td align="center">0.02</td>
<td align="center">0.22</td>
<td align="center">5</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">L</td>
<td align="center">Communication and cultural industries</td>
<td align="center">281</td>
<td align="center">0.07</td>
<td align="center">0.44</td>
<td align="center">5</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">M</td>
<td align="center">Research and experimental development, professional and technical services, science and technology promotion and application services</td>
<td align="center">227</td>
<td align="center">1.80</td>
<td align="center">2.97</td>
<td align="center">21</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">N</td>
<td align="center">Ecological protection and environmental management industry, public facilities management industry</td>
<td align="center">254</td>
<td align="center">3.34</td>
<td align="center">7.16</td>
<td align="center">49</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">O</td>
<td align="center">Land management; residential services, repair, and other services</td>
<td align="center">19</td>
<td align="center">0.84</td>
<td align="center">2.06</td>
<td align="center">8</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">P</td>
<td align="center">Education</td>
<td align="center">14</td>
<td align="center">0.36</td>
<td align="center">1.34</td>
<td align="center">5</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">Q</td>
<td align="center">Health</td>
<td align="center">46</td>
<td align="center">0.00</td>
<td align="center">0.00</td>
<td align="center">0</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">R</td>
<td align="center">Journalism and publishing; radio, television, film, and video recording production; culture and art; sports</td>
<td align="center">299</td>
<td align="center">0.02</td>
<td align="center">0.17</td>
<td align="center">2</td>
<td align="center">0</td>
</tr>
<tr>
<td align="center">S</td>
<td align="center">Others</td>
<td align="center">266</td>
<td align="center">0.05</td>
<td align="center">0.40</td>
<td align="center">5</td>
<td align="center">0</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>In this study, we exclude the financial industry, represented by the industry code &#x201C;I&#x201D;, due to the specificity of its business strategies, main activities, and the structure and figures of its financial reports.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>From <xref ref-type="table" rid="T4">Table 4</xref>, the mean number of green patents is 1.85, the minimum is 0, and the maximum is 991, indicating that overall corporate green innovation performance is low and highly uneven. The first quartile (Q1), the median (Q2), and the third quartile (Q3) are all 0, indicating that the data set is unbalanced. Green patents in the ecological protection and environmental management industry and the public facilities management industry are relatively evenly distributed, with an average of 3.34, the second highest after the extractive industry (15.74). The green innovation ability of the secondary industry, comprising the extractive, manufacturing, energy, and construction industries, is remarkable, and the number of green patents obtained by a single firm in the extractive industry reaches 991. A likely reason is that resource-intensive industries, facing heavy social responsibilities and environmental pressures, have a stronger sense of responsibility, more technical support, and more preferential policies for green transformation and industrial restructuring. Nevertheless, green innovation performance in this industry is also uneven, as some enterprises still hold no green patents. Green patents are relatively scarce in tertiary industries, such as information technology, cultural communication, and social services, indicating that measures should be taken to optimize the industrial structure, provide green financial support, incubate green projects, improve enterprises&#x2019; technical capacity, and introduce high-end, cutting-edge enterprises into the tertiary industry to push forward the green transformation.</p>
<p>Because the target variable shows a pronounced long-tailed distribution, we use a category quantile loss function to rebalance the weights of different categories in the data set, alleviating the sample imbalance and improving prediction performance for minority categories. The loss function is defined as:<disp-formula id="equ1">
<mml:math id="m1">
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2217;</mml:mo>
<mml:msub>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2264;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2217;</mml:mo>
<mml:msub>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3e;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Based on the data distribution, enterprises with low green innovation performance account for about 80% of the total sample, so the <italic>q</italic> value is set to 0.2; under-predictions are therefore weighted by 0.8 and over-predictions by 0.2. <inline-formula id="inf1">
<mml:math id="m2">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> represents the actual value, and <inline-formula id="inf2">
<mml:math id="m3">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the predicted value. <inline-formula id="inf3">
<mml:math id="m4">
<mml:mrow>
<mml:msub>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mo>&#x2219;</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the indicator function: <disp-formula id="equ2">
<mml:math id="m5">
<mml:mrow>
<mml:msub>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mo>&#x2219;</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="" separators="|">
<mml:mrow>
<mml:mtable columnalign="center">
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>if&#x2009;the&#x2009;indicated&#x2009;condition&#x2009;is&#x2009;met</mml:mtext>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mtext>if&#x2009;the&#x2009;indicated&#x2009;condition&#x2009;is&#x2009;not&#x2009;met</mml:mtext>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
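<p>The loss defined above can be written as a short Python sketch. The function names are hypothetical, and the sketch evaluates the formula for given predictions only; how the loss is wired into a particular boosting library is not covered here.</p>
<preformat>
```python
def category_quantile_loss(y_true, y_pred, q=0.2):
    """Asymmetric squared loss for one observation, following the formula above.

    Residuals y - y_hat at or below zero (over-prediction) are weighted by q;
    positive residuals (under-prediction) are weighted by 1 - q.
    """
    residual = y_true - y_pred
    weight = (1 - q) if residual > 0 else q
    return weight * residual ** 2


def mean_loss(ys, y_hats, q=0.2):
    """Average the per-observation loss over a sample."""
    total = sum(category_quantile_loss(y, h, q) for y, h in zip(ys, y_hats))
    return total / len(ys)


under = category_quantile_loss(10.0, 8.0)   # model predicts too few patents
over = category_quantile_loss(8.0, 10.0)    # model predicts too many patents
```
</preformat>
<p>With <italic>q</italic> = 0.2, an under-prediction of a given size is penalized four times as heavily as an over-prediction of the same size, which pushes the model to pay more attention to the rare enterprises with many green patents.</p>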
</sec>
<sec id="s3-3">
<title>3.3 Model setting</title>
<sec id="s3-3-1">
<title>3.3.1 Linear regression model</title>
<p>A linear regression model predicts and explains by establishing a linear relationship between the independent (explanatory) variables and the dependent variable. The model assumes that this relationship is linear, and its parameters are estimated from known sample data. Once the parameter values are obtained, the model can be used to predict new, unseen samples. The primary purpose of using the linear model in this paper is to help determine whether there is a non-linear relationship between the independent and dependent variables by comparing the predictions of the linear and non-linear models.<disp-formula id="equ3">
<mml:math id="m6">
<mml:mrow>
<mml:mi>Y</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
</mml:mstyle>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b8;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>&#x3b5;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Where <italic>Y</italic> denotes the dependent variable, <inline-formula id="inf4">
<mml:math id="m7">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> refers to the independent variable, <inline-formula id="inf5">
<mml:math id="m8">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3b8;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> denotes the model&#x2019;s parameters (also known as the regression coefficients), and <italic>&#x3b5;</italic> represents the random error. Each regression coefficient indicates the degree of influence of the corresponding independent variable on the dependent variable. The model finds the optimal regression coefficients by minimizing the sum of squared residuals (the differences between the actual and predicted values).</p>
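<p>As an illustration only, the estimation step can be sketched in a few lines of Python. The sketch assumes ordinary least squares solved through the normal equations; the helper names <italic>fit_ols</italic> and <italic>predict</italic> and the toy data are hypothetical and are not taken from the empirical analysis of this paper.</p>
<preformat preformat-type="code"><![CDATA[
```python
def fit_ols(X, y):
    """Solve the normal equations (X^T X) theta = X^T y by Gauss-Jordan elimination."""
    n_features = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(row[j] * row[k] for row in X) for k in range(n_features)]
         + [sum(row[j] * yi for row, yi in zip(X, y))]
         for j in range(n_features)]
    for col in range(n_features):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n_features), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n_features):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[j][n_features] / A[j][j] for j in range(n_features)]


def predict(X, theta):
    """Linear prediction: sum_i theta_i * X_i for every row of X."""
    return [sum(t * v for t, v in zip(theta, row)) for row in X]


# Hypothetical toy data generated from y = 2*x1 + 3*x2 without noise.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, 3.0, 5.0, 7.0]
theta = fit_ols(X, y)
```
]]></preformat>
<p>On this noise-free toy sample the estimated coefficients recover the generating values; with real data the residuals would not vanish, and the coefficients would minimize their sum of squares.</p>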
</sec>
<sec id="s3-3-2">
<title>3.3.2 Decision tree model</title>
<p>When a decision tree is applied to a regression problem, the node-splitting criterion and the generation of child nodes are similar to those of a classification tree, but the prediction at each leaf node is the mean (or another statistic) of the sample points falling into that node rather than the result of a &#x201c;voting method&#x201d;.</p>
<p>Constructing a decision tree consists mainly of feature selection and the determination of splitting criteria. By selecting the best feature attributes and appropriate splitting criteria, the decision tree partitions the data set into subsets with different predictions. The tree structure consists of a root node, internal nodes, and leaf nodes: the root node contains the complete set of samples, the internal nodes represent the decision conditions, and the leaf nodes represent the final prediction results (<xref ref-type="bibr" rid="B4">Breiman et al., 1984</xref>).</p>
<p>Assuming that the input space is divided into <inline-formula id="inf6">
<mml:math id="m9">
<mml:mrow>
<mml:mi>M</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> cells, i.e., <inline-formula id="inf7">
<mml:math id="m10">
<mml:mrow>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mi>M</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, and there is a fixed output value <inline-formula id="inf8">
<mml:math id="m11">
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> on each cell <inline-formula id="inf9">
<mml:math id="m12">
<mml:mrow>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, the regression tree model can be expressed as: <disp-formula id="equ4">
<mml:math id="m13">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>M</mml:mi>
</mml:munderover>
</mml:mstyle>
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>where <inline-formula id="inf10">
<mml:math id="m14">
<mml:mrow>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is the indicator function that equals one if <inline-formula id="inf11">
<mml:math id="m15">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and zero otherwise.</p>
<p>Each node is split at the cut-off point that minimizes the squared error. Let <inline-formula id="inf12">
<mml:math id="m16">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1,2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x22ef;</mml:mo>
<mml:mo>&#x22ef;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, where <inline-formula id="inf13">
<mml:math id="m17">
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> is the number of samples. We take the <italic>j</italic>th independent variable <inline-formula id="inf14">
<mml:math id="m18">
<mml:mrow>
<mml:msup>
<mml:mi>x</mml:mi>
<mml:mi>j</mml:mi>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> as the splitting variable and a value <italic>s</italic> of it as the cut-off point, and define two regions: <disp-formula id="equ5">
<mml:math id="m19">
<mml:mrow>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
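<mml:mrow>
<mml:mfenced open="{" close="}" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x7c;</mml:mo>
<mml:msup>
<mml:mi>x</mml:mi>
<mml:mi>j</mml:mi>
</mml:msup>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>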
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ6">
<mml:math id="m20">
<mml:mrow>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
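<mml:mrow>
<mml:mfenced open="{" close="}" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x7c;</mml:mo>
<mml:msup>
<mml:mi>x</mml:mi>
<mml:mi>j</mml:mi>
</mml:msup>
<mml:mo>&#x3e;</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>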
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>The optimal splitting variable <italic>j</italic> and cut-off point <italic>s</italic> are then found by minimizing the squared error: <disp-formula id="equ7">
<mml:math id="m21">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">min</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="italic">min</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:msub>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi mathvariant="italic">min</mml:mi>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:msub>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>where <inline-formula id="inf15">
<mml:math id="m22">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the input of the <italic>i</italic>th sample, <inline-formula id="inf16">
<mml:math id="m23">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the corresponding output, and <inline-formula id="inf17">
<mml:math id="m24">
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf18">
<mml:math id="m25">
<mml:mrow>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are the fixed output values of the two regions, whose optimal values are the means of the output variable within each region: <disp-formula id="equ8">
<mml:math id="m26">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>c</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="italic">aver</mml:mi>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ9">
<mml:math id="m27">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>c</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="italic">aver</mml:mi>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
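<p>The split search described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this paper: the function names <italic>sse</italic> and <italic>best_split</italic> and the toy data are hypothetical, and candidate cut-off points are assumed to be the observed values of each variable.</p>
<preformat preformat-type="code"><![CDATA[
```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Squared error of a region when its output is set to the region mean."""
    c_hat = mean(values)
    return sum((v - c_hat) ** 2 for v in values)


def best_split(X, y):
    """Return (j, s, error) minimizing the summed squared error of the two regions."""
    best = None
    for j in range(len(X[0])):
        # Candidate cut-off points: observed values of variable j, excluding the
        # largest one so that the region with x^j > s is never empty.
        for s in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= s]
            right = [yi for row, yi in zip(X, y) if row[j] > s]
            error = sse(left) + sse(right)
            if best is None or error < best[2]:
                best = (j, s, error)
    return best


# Hypothetical toy data: the output jumps when the first variable exceeds 2.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y = [1.0, 1.0, 5.0, 5.0]
j, s, error = best_split(X, y)
```
]]></preformat>
<p>Here the search selects the first variable with cut-off point 2, which separates the two output levels exactly. A full regression tree would apply the same search recursively to each resulting region until a stopping rule is met.</p>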
</sec>
<sec id="s3-3-3">
<title>3.3.3 Random forest model</title>
<p>The random forest model is based on the idea of bagging: multiple weak learners are trained on randomly drawn subsamples and then combined into one strong learner. Specifically, a fixed number of samples is first drawn at random from the training set to form a new training set, and a decision tree is built on it using a randomly selected subset of features. This process is repeated with the same number of samples and features to create multiple decision trees, which can be built in parallel and together form a random forest. Finally, their results are voted on or averaged to produce the classification or regression prediction (<xref ref-type="bibr" rid="B6">Chen, 2021</xref>). Because of the randomness in both the subsample extraction and the feature selection, the random forest compensates for the weak generalization ability of a single decision tree and, to some extent, alleviates its overfitting problem.</p>
<p>The prediction function of a random forest can be written as: <disp-formula id="equ10">
<mml:math id="m28">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Where <inline-formula id="inf19">
<mml:math id="m29">
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, <italic>T</italic> is the number of decision trees, <inline-formula id="inf20">
<mml:math id="m30">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is the predicted output of the random forest for the input sample <italic>x</italic>, <inline-formula id="inf21">
<mml:math id="m31">
<mml:mrow>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is the predicted output of the <italic>t</italic>th decision tree for the input sample <italic>x</italic>.</p>
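<p>For illustration, the bagging and averaging steps can be sketched as below. Several simplifications are assumed and should not be read as the configuration used in this paper: each tree is a depth-one regression tree, the feature subset is drawn once per tree, and the names <italic>fit_stump</italic>, <italic>fit_forest</italic>, and <italic>predict_forest</italic>, the seed, and the toy data are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random


def fit_stump(X, y, features):
    """Fit a depth-one regression tree using only the given feature indices."""
    overall = sum(y) / len(y)
    best = (float("inf"), None, None, overall, overall)
    for j in features:
        for s in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= s]
            right = [yi for row, yi in zip(X, y) if row[j] > s]
            c1, c2 = sum(left) / len(left), sum(right) / len(right)
            err = sum((v - c1) ** 2 for v in left) + sum((v - c2) ** 2 for v in right)
            if err < best[0]:
                best = (err, j, s, c1, c2)
    _, j, s, c1, c2 = best
    if j is None:  # no valid split: predict the sample mean everywhere
        return lambda row: overall
    return lambda row: c1 if row[j] <= s else c2


def fit_forest(X, y, n_trees=25, n_sub_features=1, seed=0):
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]       # bootstrap sample
        features = rng.sample(range(p), n_sub_features)  # random feature subset
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return trees


def predict_forest(trees, row):
    """f(x) = (1/T) * sum_t g_t(x): the average of the T tree predictions."""
    return sum(g(row) for g in trees) / len(trees)


# Hypothetical toy data with two input variables.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0], [5.0, 4.0], [6.0, 7.0]]
y = [1.0, 1.2, 0.9, 5.1, 4.8, 5.0]
trees = fit_forest(X, y)
prediction = predict_forest(trees, [5.5, 4.0])
```
]]></preformat>
<p>Because every tree predicts a mean of observed outputs, the averaged forest prediction always stays within the range of the training outputs.</p>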
</sec>
<sec id="s3-3-4">
<title>3.3.4 Gradient boosting model</title>
<p>Based on the idea of boosting, the gradient boosting tree is an iterative ensemble algorithm consisting of multiple decision trees built sequentially from the original training set. In each iteration a new decision tree is trained on the residuals of the current model, so that the remaining residuals are reduced along the gradient direction and the predictions move closer to the actual values (<xref ref-type="bibr" rid="B24">Jerome, 2001</xref>). In other words, the model improves its predictive performance by iteratively creating weak learners (usually decision trees), each trained on the residuals left by its predecessors, and finally combining all the weak learners into one strong regressor.</p>
<p>Assume that the gradient boosting model attempts to estimate the objective function <inline-formula id="inf22">
<mml:math id="m32">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> to minimize the loss function <inline-formula id="inf23">
<mml:math id="m33">
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>y</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>t</mml:mi>
</mml:msup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>: <disp-formula id="equ11">
<mml:math id="m34">
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>t</mml:mi>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>where <inline-formula id="inf24">
<mml:math id="m35">
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1,2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> and <italic>T</italic> is the number of decision trees.</p>
<p>Then, the initial function is defined as:<disp-formula id="equ12">
<mml:math id="m36">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>arg</mml:mi>
<mml:msub>
<mml:mi mathvariant="italic">min</mml:mi>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:msub>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ13">
<mml:math id="m37">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>&#x3c1;</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:munderover>
</mml:mstyle>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>where <inline-formula id="inf25">
<mml:math id="m38">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1,2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, <italic>N</italic> is the number of samples and <inline-formula id="inf26">
<mml:math id="m39">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the observed value of the <italic>i</italic>th sample.</p>
<p>To mitigate sample imbalance, we draw on the idea of the quantile loss function and define the loss function as: <disp-formula id="equ14">
<mml:math id="m40">
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2217;</mml:mo>
<mml:msub>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2264;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2217;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2217;</mml:mo>
<mml:msub>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3e;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Based on the data distribution in this paper, the 20% range with the lowest green innovation performance contains 80% of the total sample; hence <italic>q</italic> is set to 0.2. <inline-formula id="inf27">
<mml:math id="m41">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> represents the actual value, and <inline-formula id="inf28">
<mml:math id="m42">
<mml:mrow>
<mml:msubsup>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> is the predicted value. <inline-formula id="inf29">
<mml:math id="m43">
<mml:mrow>
<mml:msub>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mo>&#x2219;</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the indicator function: <disp-formula id="equ15">
<mml:math id="m44">
<mml:mrow>
<mml:msub>
<mml:mi>I</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mo>&#x2219;</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mfenced open="{" close="" separators="|">
<mml:mrow>
<mml:mtable columnalign="center">
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="italic">indicated&#x2009;conditions&#x2009;are&#x2009;met</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="italic">indicated&#x2009;conditions&#x2009;are&#x2009;not&#x2009;met</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
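<p>A short numerical sketch may help to show how this loss treats the two error directions. The function name <italic>asymmetric_squared_loss</italic> and the example values are hypothetical; only the formula and <italic>q</italic> = 0.2 come from the text.</p>
<preformat preformat-type="code"><![CDATA[
```python
def asymmetric_squared_loss(y, y_hat, q=0.2):
    """Squared error weighted by q when y - y_hat <= 0 and by 1 - q otherwise."""
    residual = y - y_hat
    weight = q if residual <= 0 else 1.0 - q
    return weight * residual ** 2


# Same absolute error of 1 in both directions (hypothetical values).
over = asymmetric_squared_loss(1.0, 2.0)   # prediction too high: weight q
under = asymmetric_squared_loss(2.0, 1.0)  # prediction too low: weight 1 - q
```
]]></preformat>
<p>With <italic>q</italic> = 0.2, an under-prediction is penalized four times as heavily as an over-prediction of the same size, which pushes the fitted model to pay more attention to the less frequent high-performance observations.</p>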
<p>Then, the negative gradient <inline-formula id="inf30">
<mml:math id="m45">
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> of the loss function is calculated based on the following equation: <disp-formula id="equ16">
<mml:math id="m46">
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mfenced open="[" close="]" separators="|">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mo>&#x2202;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2202;</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>A new regression tree <inline-formula id="inf31">
<mml:math id="m47">
<mml:mrow>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is then fitted to the negative gradients, yielding: <disp-formula id="equ17">
<mml:math id="m48">
<mml:mrow>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Finally, the newly generated regression tree is added to the model, with its weight chosen to minimize the loss function: <disp-formula id="equ18">
<mml:math id="m49">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>arg</mml:mi>
<mml:msub>
<mml:mi mathvariant="italic">min</mml:mi>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:msub>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ19">
<mml:math id="m50">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>where <inline-formula id="inf32">
<mml:math id="m51">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> can be interpreted as the learning rate that scales the decision trees added to the model (<xref ref-type="bibr" rid="B44">Tibshirani, 1996</xref>). <inline-formula id="inf33">
<mml:math id="m52">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is the strong learner obtained after adding the <italic>t</italic>th regression tree. Recomputing the negative gradient and repeating these steps for <italic>T</italic> rounds yields the final gradient boosting model.</p>
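<p>The full procedure can be sketched as a short loop. The sketch rests on assumptions that are not part of this paper: a single input variable, depth-one regression trees, a per-sample negative gradient without the constant 1/<italic>N</italic> factor, and a coarse grid search standing in for the exact minimization over the tree weight. All function names and the toy data are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
def loss(y, y_hat, q=0.2):
    r = y - y_hat
    return (q if r <= 0 else 1.0 - q) * r ** 2


def neg_gradient(y, y_hat, q=0.2):
    """Negative derivative of the loss with respect to the prediction."""
    r = y - y_hat
    return 2.0 * (q if r <= 0 else 1.0 - q) * r


def fit_stump(x, r):
    """Least-squares depth-one tree on a single variable x with targets r."""
    best = (float("inf"), None, 0.0, 0.0)
    for s in sorted(set(x))[:-1]:
        left = [ri for xi, ri in zip(x, r) if xi <= s]
        right = [ri for xi, ri in zip(x, r) if xi > s]
        c1, c2 = sum(left) / len(left), sum(right) / len(right)
        err = sum((v - c1) ** 2 for v in left) + sum((v - c2) ** 2 for v in right)
        if err < best[0]:
            best = (err, s, c1, c2)
    _, s, c1, c2 = best
    if s is None:  # x has a single distinct value: fall back to a constant
        const = sum(r) / len(r)
        return lambda xi: const
    return lambda xi: c1 if xi <= s else c2


def boost(x, y, n_rounds=20, q=0.2):
    f0 = sum(y) / len(y)  # initial constant model: the sample mean
    pred = [f0] * len(y)
    stages = []
    history = [sum(loss(a, b, q) for a, b in zip(y, pred))]
    for _ in range(n_rounds):
        r = [neg_gradient(a, b, q) for a, b in zip(y, pred)]
        g = fit_stump(x, r)  # new tree fitted to the negative gradients
        steps = [g(xi) for xi in x]
        # Grid search for rho_t; the grid contains 0, so the loss never rises.
        rho = min((k / 10 for k in range(0, 31)),
                  key=lambda c: sum(loss(a, b + c * h, q)
                                    for a, b, h in zip(y, pred, steps)))
        pred = [b + rho * h for b, h in zip(pred, steps)]
        stages.append((rho, g))
        history.append(sum(loss(a, b, q) for a, b in zip(y, pred)))
    return f0, stages, history


def predict(f0, stages, xi):
    """Strong learner: initial constant plus the weighted sum of all trees."""
    return f0 + sum(rho * g(xi) for rho, g in stages)


# Hypothetical toy data with a jump between x = 3 and x = 4.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.1, 0.2, 0.1, 3.0, 3.2, 2.9]
f0, stages, history = boost(x, y)
```
]]></preformat>
<p>The list <italic>history</italic> records the total training loss after each round, so it can be inspected to confirm that every added tree leaves the loss unchanged or lower.</p>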
</sec>
<sec id="s3-3-5">
<title>3.3.5 Partial dependence graph</title>
<p>Drawing on <xref ref-type="bibr" rid="B14">Friedman&#x2019;s (2001)</xref> study, we construct partial dependence graphs from the gradient boosting model to characterize the marginal effects of selected input variables, including corporate financial capability, environmental regulation, and industry attributes, on the output variable, corporate green innovation performance. Specifically, suppose that we predict corporate green innovation performance from the information set <inline-formula id="inf34">
<mml:math id="m53">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>P</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> containing <italic>P</italic> variables and finally generate the prediction function <inline-formula id="inf35">
<mml:math id="m54">
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>P</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <inline-formula id="inf36">
<mml:math id="m55">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> denotes the partial dependence function of the <italic>i</italic>th variable, <inline-formula id="inf37">
<mml:math id="m56">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> represents the <inline-formula id="inf38">
<mml:math id="m57">
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> th variable, <inline-formula id="inf39">
<mml:math id="m58">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>&#xac;</mml:mo>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> refers to the variables other than the <italic>i</italic>th variable in the information set <inline-formula id="inf40">
<mml:math id="m59">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>P</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, and <italic>N</italic> is the number of instances in the dataset. The partial dependence function for variable <inline-formula id="inf41">
<mml:math id="m60">
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, denoted <inline-formula id="inf42">
<mml:math id="m61">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>P</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is given by: <disp-formula id="equ20">
<mml:math id="m62">
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>P</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>&#xac;</mml:mo>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>&#xac;</mml:mo>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:msub>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>X</mml:mi>
<mml:mo>&#xac;</mml:mo>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:munderover>
</mml:mstyle>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mo>&#xac;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mi>j</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>In decision-tree-based estimation algorithms such as random forest and gradient boosting models, for different values of variable <inline-formula id="inf43">
<mml:math id="m63">
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, we can calculate the corresponding partial dependence from the sample mean and thereby generate a partial dependence graph of variable <inline-formula id="inf44">
<mml:math id="m64">
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>. This tool depicts the non-linear relationship between each indicator and corporate green innovation performance.</p>
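<p>The sample-mean estimate described above can be sketched as follows. For each grid value, the feature of interest is overwritten in every instance, the prediction function is evaluated, and the predictions are averaged. The function <monospace>toy_model</monospace>, the data, and the grid are hypothetical stand-ins for a fitted model and the real dataset.</p>
<preformat><![CDATA[
```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with rows[j][feature] fixed to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)        # copy so the data set is left unchanged
            modified[feature] = value   # fix X_i, keep the other variables as observed
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row):
    """Hypothetical fitted model: non-linear in feature 0, linear in feature 1."""
    return row[0] ** 2 + 3.0 * row[1]


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [3.0, 6.0]]
grid = [0.0, 1.0, 2.0]
pd_curve = partial_dependence(toy_model, rows, feature=0, grid=grid)
print(list(zip(grid, pd_curve)))
```
]]></preformat>
<p>Plotting the grid values against the averaged predictions gives the partial dependence graph for the chosen variable.</p>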
</sec>
</sec>
<sec id="s3-4">
<title>3.4 Evaluation metrics</title>
<p>We use MAE, MSE, RMSE, and R-squared metrics to measure the accuracy and reliability of the prediction results. MAE (Mean Absolute Error) is the average of the absolute values of the differences between the predicted and actual values. It measures the average error size of the model, and a smaller MAE indicates that the model&#x2019;s prediction result is more accurate.<disp-formula id="equ21">
<mml:math id="m65">
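<mml:mrow>
<mml:mi>MAE</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>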
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mfenced open="|" close="|" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>where <inline-formula id="inf45">
<mml:math id="m66">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> denotes the actual value, and <inline-formula id="inf46">
<mml:math id="m67">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> represents the predicted value.</p>
<p>MSE (Mean Squared Error) is the average of the squares of the differences between the predicted and actual values. It also measures the magnitude of the model error but, because the errors are squared, penalizes large errors more heavily than the MAE. The MSE is therefore more sensitive to outliers than the MAE, and a smaller MSE indicates a more accurate prediction result from the model.<disp-formula id="equ22">
<mml:math id="m68">
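<mml:mrow>
<mml:mi>MSE</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>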
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>RMSE (Root Mean Squared Error) is the square root of the MSE. It retains the magnitude of the error and has the same units as the target variable, making it easier to understand. A smaller RMSE indicates a more accurate prediction by the model.<disp-formula id="equ23">
<mml:math id="m69">
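<mml:mrow>
<mml:mi>RMSE</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>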
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>R-squared measures the ability of a model to explain the variation in the data. Its value is at most 1, and the closer it is to 1, the better the model fits the data; on held-out data it can be negative when the model predicts worse than simply using the mean of the actual values. Comparing R-squared on the training and test sets helps assess whether the model is over- or under-fitted, as well as its reliability and stability.<disp-formula id="equ24">
<mml:math id="m70">
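<mml:mrow>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>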
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
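<p>The four metrics can be computed directly from their definitions. The following sketch uses only the Python standard library; the function name and the toy values are illustrative assumptions. The second call uses deliberately poor predictions to illustrate that R-squared becomes negative when a model predicts worse than the mean of the actual values.</p>
<preformat><![CDATA[
```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired actual/predicted values."""
    n = len(y_true)
    errors = [y - p for y, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    y_mean = sum(y_true) / n
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


y_true = [3.0, 5.0, 7.0, 9.0]
good = regression_metrics(y_true, [2.0, 5.0, 8.0, 11.0])
bad = regression_metrics(y_true, [9.0, 7.0, 5.0, 3.0])   # reversed predictions
print(good)
print(bad)
```
]]></preformat>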
</sec>
</sec>
<sec id="s4">
<title>4 Results and analysis</title>
<sec id="s4-1">
<title>4.1 Prediction results and analysis</title>
<p>We run repeated experiments over specified ranges of parameter values and compare the results to select the model&#x2019;s parameter values. The tuned parameters mainly include the following:<list list-type="simple">
<list-item>
<p>(1) Learning rate: The learning rate controls how much each weak learner (base learner) contributes to the overall model. A lower learning rate makes the model more stable but may require more weak learners to perform better. Typically, the learning rate takes values between 0 and 1. For example, in a gradient boosting model, a new regression tree <inline-formula id="inf47">
<mml:math id="m71">
<mml:mrow>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is fitted to the negative gradient of the loss, yielding:</p>
</list-item>
</list>
<disp-formula id="equ25">
<mml:math id="m72">
<mml:mrow>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>This tree is introduced into the objective function as follows: <disp-formula id="equ26">
<mml:math id="m73">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>arg</mml:mi>
<mml:msub>
<mml:mi mathvariant="italic">min</mml:mi>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:msub>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ27">
<mml:math id="m74">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>where <inline-formula id="inf48">
<mml:math id="m75">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> can be interpreted as the learning rate that scales the decision trees added to the model (<xref ref-type="bibr" rid="B44">Tibshirani, 1996</xref>). <inline-formula id="inf49">
<mml:math id="m76">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>f</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> is the strong learner obtained by accumulating the first <italic>t</italic> regression trees.<list list-type="simple">
<list-item>
<p>(2) The number of weak learners (n_estimators): The number of weak learners is an essential parameter in the gradient boosting model. It sets the number of decision trees, and hence the complexity of the model, and affects the training time. Increasing the number of weak learners makes the model more complex and lets it fit the training data better: additional weak learners provide more decision boundaries, or greater function approximation capacity, and each added learner further reduces the training error. However, too many weak learners may lead to overfitting and, thus, poor performance on new data. Therefore, choosing the number of weak learners requires a trade-off between model complexity and generalization performance.</p>
</list-item>
<list-item>
<p>(3) Maximum depth of trees (max_depth): The maximum depth of trees is another important parameter in the gradient boosting model. It limits how deep each decision tree can grow and significantly affects the model&#x2019;s complexity and generalization ability. A larger maximum depth allows each decision tree to capture more complex feature relationships, which improves the model&#x2019;s ability to fit the training data. However, too large a maximum depth makes the model overly sensitive to noise and random variation, reducing its generalization ability.</p>
</list-item>
</list>
</p>
<p>Parameter selection is crucial. For the random forest and gradient boosting models, we exhaustively search all parameter combinations within a given range and use cross-validation to evaluate each combination, selecting the one that also performs well on the test set. The searched parameters are the &#x201c;number of decision trees&#x201d; and the &#x201c;maximum depth of decision trees&#x201d;, each ranging from 1 to 100. The experimental results show that both parameters substantially affect model performance. As <xref ref-type="fig" rid="F1">Figures 1C, D</xref> show, the performance of the gradient boosting model becomes optimal and stable once the number of decision trees and the maximum depth reach about 20. In the random forest model, performance becomes optimal and starts to stabilize only when the number of decision trees and the maximum depth reach approximately 30 (<xref ref-type="fig" rid="F1">Figures 1A, B</xref>).</p>
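<p>An exhaustive search with cross-validation, as described above, can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the procedure used in this study: the learner is a toy booster built from depth-one trees, so the grid covers the number of trees and the learning rate instead of the maximum depth; the folds are interleaved by index; and the data are synthetic. All function names are hypothetical.</p>
<preformat><![CDATA[
```python
import itertools


def fit_stump(xs, targets):
    """One-split regression tree: return (threshold, left_mean, right_mean)."""
    best = None
    for thr in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= thr]
        right = [t for x, t in zip(xs, targets) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]


def fit_boosted_stumps(xs, ys, n_estimators, learning_rate):
    """Toy gradient boosting for squared loss; returns a prediction function."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, lm, rm = fit_stump(xs, residuals)
        stumps.append((thr, lm, rm))
        preds = [p + learning_rate * (lm if x <= thr else rm)
                 for x, p in zip(xs, preds)]

    def predict(x):
        return base + learning_rate * sum(lm if x <= thr else rm
                                          for thr, lm, rm in stumps)
    return predict


def cv_mse(xs, ys, k, **params):
    """Mean test MSE over k interleaved folds for one parameter combination."""
    fold_errors = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        model = fit_boosted_stumps([xs[i] for i in train],
                                   [ys[i] for i in train], **params)
        fold_errors.append(sum((ys[i] - model(xs[i])) ** 2 for i in test) / len(test))
    return sum(fold_errors) / k


def grid_search(xs, ys, param_grid, k=4):
    """Evaluate every combination in param_grid; return the best and all scores."""
    names = sorted(param_grid)
    results = {}
    for values in itertools.product(*(param_grid[name] for name in names)):
        results[values] = cv_mse(xs, ys, k, **dict(zip(names, values)))
    best = min(results, key=results.get)
    return dict(zip(names, best)), results


xs = [i / 4 for i in range(24)]
ys = [x ** 2 - 3 * x for x in xs]     # synthetic non-linear target (assumption)
param_grid = {"n_estimators": [1, 5, 20], "learning_rate": [0.1, 0.5]}
best_params, cv_results = grid_search(xs, ys, param_grid)
print(best_params)
```
]]></preformat>
<p>The keys of <monospace>cv_results</monospace> follow the alphabetical order of the parameter names, here (learning_rate, n_estimators). The selected combination should still be checked on a separate test set, since the cross-validated score is itself used for selection.</p>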
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>The number and the maximum depth of the decision trees in the random forest and gradient boosting models. <bold>(A)</bold> Random forest training data. <bold>(B)</bold> Random forest testing data. <bold>(C)</bold> Gradient boosting decision tree training data. <bold>(D)</bold> Gradient boosting decision tree testing data.</p>
</caption>
<graphic xlink:href="fenvs-11-1252271-g001.tif"/>
</fig>
<p>In this study, multiple linear regression, decision tree, random forest, and gradient boosting models are used to predict changes in corporate green innovation performance, and the prediction results are compared. <xref ref-type="fig" rid="F2">Figures 2A&#x2013;H</xref> show the fit between the actual and predicted corporate green innovation performance values under the four modeling algorithms. Compared with the other three methods, the gradient boosting algorithm further improves the prediction accuracy of corporate green innovation performance, and the trend of the predicted values coincides with that of the actual values. <xref ref-type="table" rid="T5">Table 5</xref> lists the evaluation metrics of each prediction model on the training and test sets, including the R-squared and the three error metrics MAE, MSE, and RMSE. Smaller error metrics and an R-squared closer to 1 indicate a smaller deviation of the predicted values from the actual values and hence a more accurate prediction model. On the test set, the gradient boosting model has the lowest MSE and RMSE and the R-squared closest to 1 (0.54), while its MAE (1.01) is slightly higher than that of the random forest (0.98). Taking the MSE (Mean Squared Error) as an example, the test-set value for the gradient boosting model is 10.51, which is lower than that of the random forest (12.13) and far lower than those of the multiple linear regression (48.28) and the decision tree (59.67).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>The fitting effectiveness of the actual and predicted corporate green innovation performance values under four modeling algorithms. <bold>(A)</bold> Linear regression training data. <bold>(B)</bold> Linear regression testing data. <bold>(C)</bold> Decision tree training data. <bold>(D)</bold> Decision tree testing data. <bold>(E)</bold> Random forest training data. <bold>(F)</bold> Random forest testing data. <bold>(G)</bold> Gradient boosting decision tree training data. <bold>(H)</bold> Gradient boosting decision tree testing data.</p>
</caption>
<graphic xlink:href="fenvs-11-1252271-g002.tif"/>
</fig>
<table-wrap id="T5" position="float">
<label>TABLE 5</label>
<caption>
<p>Evaluation metrics of the models.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th rowspan="2" align="left"/>
<th colspan="2" align="center">LR</th>
<th colspan="2" align="center">Decision tree</th>
<th colspan="2" align="center">Random forest</th>
<th colspan="2" align="center">GBDT</th>
</tr>
<tr>
<th align="left">Training</th>
<th align="left">Testing</th>
<th align="left">Training</th>
<th align="left">Testing</th>
<th align="left">Training</th>
<th align="left">Testing</th>
<th align="left">Training</th>
<th align="left">Testing</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">MAE</td>
<td align="right">1.48</td>
<td align="right">1.55</td>
<td align="right">0.94</td>
<td align="right">1.15</td>
<td align="right">0.87</td>
<td align="right">0.98</td>
<td align="right">0.97</td>
<td align="right">1.01</td>
</tr>
<tr>
<td align="center">MSE</td>
<td align="right">36.08</td>
<td align="right">48.28</td>
<td align="right">17.28</td>
<td align="right">59.67</td>
<td align="right">11.11</td>
<td align="right">12.13</td>
<td align="right">10.18</td>
<td align="right">10.51</td>
</tr>
<tr>
<td align="center">RMSE</td>
<td align="right">6.01</td>
<td align="right">6.95</td>
<td align="right">4.16</td>
<td align="right">7.72</td>
<td align="right">3.33</td>
<td align="right">3.48</td>
<td align="right">3.19</td>
<td align="right">3.24</td>
</tr>
<tr>
<td align="center">R-squared</td>
<td align="right">0.35</td>
<td align="right">&#x2212;0.05</td>
<td align="right">0.69</td>
<td align="right">&#x2212;0.30</td>
<td align="right">0.83</td>
<td align="right">0.46</td>
<td align="right">0.84</td>
<td align="right">0.54</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As shown in <xref ref-type="fig" rid="F2">Figures 2C&#x2013;F</xref>, the prediction performance of the random forest algorithm is superior to that of the decision tree algorithm. As a basic classification and regression method, a decision tree partitions the data and makes predictions by constructing a tree structure. However, a single decision tree tends to fit noise and outliers in the training data, which weakens generalization and leads to overfitting. Random forest, as an ensemble learning method, trains each decision tree on a random subset of features and samples, thus reducing overfitting and improving the accuracy and stability of the model. Therefore, compared with an individual decision tree, the random forest model has better generalization performance and stability when dealing with complex classification and regression problems (<xref ref-type="bibr" rid="B21">Hastie et al., 2009</xref>; <xref ref-type="bibr" rid="B25">Kotsiantis, 2013</xref>).</p>
<p>The gradient boosting algorithm outperforms the other three algorithms, for two reasons. First, random forests can overfit noisy data in classification or regression studies. The method may produce multiple similar decision trees for data sets with different characteristics, which may bias the results. The gradient boosting model, based on gradient optimization, has high prediction accuracy and strong generalization ability; it adapts well to the characteristics of the data and flexibly handles both continuous and discrete data, especially in data sets with non-linear relationships. In addition, the gradient boosting model is more suitable for non-linear regression problems, while the random forest is more suitable for classification problems (<xref ref-type="bibr" rid="B6">Chen, 2021</xref>). Second, unlike random forests, which generate decision trees in parallel, the gradient boosting model is an iterative ensemble algorithm in which the trees are built sequentially and each tree is trained on the residuals of the trees before it. In addition, while random forests treat the training set in an undifferentiated way, the gradient boosting model assigns weights to different decision trees according to their importance (<xref ref-type="bibr" rid="B24">Jerome, 2001</xref>). This iterative approach often allows gradient boosting trees to outperform random forests in predictive performance.</p>
</sec>
<sec id="s4-2">
<title>4.2 Relative importance analysis</title>
<p>The relative importance analysis of the predictor variables (<xref ref-type="table" rid="T6">Table 6</xref>) indicates that the relative importance of financial indicators, operating capacity, command-based environmental regulation, incentive-based environmental regulation, and industry attributes is 40.76%, 24.61%, 14.69%, 12.72%, and 7.23%, respectively. Specifically, operating income growth rate, paid-in capital or equity, total profit, industry attributes, and completed investment in pollution control projects in the current year occupy the top five positions. Overall, first, financial indicators and operating capacity are the most important predictors of corporate green innovation performance. Second, the relative importance of command-based environmental regulation for corporate green innovation performance is more pronounced than that of incentive-based environmental regulation. Third, there is heterogeneity in the impact of the secondary indicators of each influencing factor. Fourth, there is industry heterogeneity in corporate green innovation performance.</p>
<table-wrap id="T6" position="float">
<label>TABLE 6</label>
<caption>
<p>Relative importance analysis of the predictor variables.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center"/>
<th align="left"/>
<th colspan="2" align="center">Based on gradient boosting model</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">Primary variables</td>
<td align="center">Secondary variables</td>
<td align="center">Relative importance</td>
<td align="center">Ranking of relative importance</td>
</tr>
<tr>
<td rowspan="6" align="center">Financial indicators</td>
<td align="center">Total assets</td>
<td align="center">4.37%</td>
<td align="center">12</td>
</tr>
<tr>
<td align="center">Net Fixed Assets</td>
<td align="center">4.52%</td>
<td align="center">11</td>
</tr>
<tr>
<td align="center">Total liabilities</td>
<td align="center">6.83%</td>
<td align="center">6</td>
</tr>
<tr>
<td align="center">Paid-in capital or equity</td>
<td align="center">10.76%</td>
<td align="center">2</td>
</tr>
<tr>
<td align="center">Total profit</td>
<td align="center">10.62%</td>
<td align="center">3</td>
</tr>
<tr>
<td align="center">Net profit</td>
<td align="center">3.66%</td>
<td align="center">14</td>
</tr>
<tr>
<td rowspan="4" align="center">Operating Capacity</td>
<td align="center">Net Cash Flow from Operating Activities</td>
<td align="center">3.32%</td>
<td align="center">15</td>
</tr>
<tr>
<td align="center">Total annual market value of individual stocks</td>
<td align="center">6.20%</td>
<td align="center">8</td>
</tr>
<tr>
<td align="center">Total Asset Turnover Ratio</td>
<td align="center">3.70%</td>
<td align="center">13</td>
</tr>
<tr>
<td align="center">Operating Income Growth Rate</td>
<td align="center">11.39%</td>
<td align="center">1</td>
</tr>
<tr>
<td rowspan="4" align="center">Command-based environmental regulation</td>
<td align="center">Number of Penalty Decisions</td>
<td align="center">6.20%</td>
<td align="center">7</td>
</tr>
<tr>
<td align="center">Number of EIA document approvals for construction projects in the current year</td>
<td align="center">2.23%</td>
<td align="center">16</td>
</tr>
<tr>
<td align="center">Number of NPC proposals</td>
<td align="center">4.92%</td>
<td align="center">9</td>
</tr>
<tr>
<td align="center">Number of CPPCC proposals</td>
<td align="center">1.34%</td>
<td align="center">17</td>
</tr>
<tr>
<td rowspan="3" align="center">Incentive-based environmental regulation</td>
<td align="center">Completed investment in pollution control projects in the current year (RMB million)</td>
<td align="center">7.04%</td>
<td align="center">5</td>
</tr>
<tr>
<td align="center">Investment in industrial pollution control (RMB million)</td>
<td align="center">4.64%</td>
<td align="center">10</td>
</tr>
<tr>
<td align="center">Pollution charge fees (environmental taxes since 2018)</td>
<td align="center">1.04%</td>
<td align="center">18</td>
</tr>
<tr>
<td align="center">Industry attributes</td>
<td align="center">Industry code</td>
<td align="center">7.23%</td>
<td align="center">4</td>
</tr>
</tbody>
</table>
</table-wrap>
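<p>The primary-level shares and the top-five ranking quoted in the text can be reproduced from the secondary-level values in Table 6. The sketch below transcribes those values (with shortened labels) and aggregates them; the variable names are illustrative. Because the published values are rounded, the five group totals sum to 100.01% rather than exactly 100%.</p>
<preformat><![CDATA[
```python
# Relative importance (%) of the secondary variables, transcribed from Table 6.
importance = {
    "Financial indicators": {
        "Total assets": 4.37,
        "Net fixed assets": 4.52,
        "Total liabilities": 6.83,
        "Paid-in capital or equity": 10.76,
        "Total profit": 10.62,
        "Net profit": 3.66,
    },
    "Operating capacity": {
        "Net cash flow from operating activities": 3.32,
        "Total annual market value of individual stocks": 6.20,
        "Total asset turnover ratio": 3.70,
        "Operating income growth rate": 11.39,
    },
    "Command-based environmental regulation": {
        "Number of penalty decisions": 6.20,
        "EIA document approvals": 2.23,
        "NPC proposals": 4.92,
        "CPPCC proposals": 1.34,
    },
    "Incentive-based environmental regulation": {
        "Completed investment in pollution control projects": 7.04,
        "Investment in industrial pollution control": 4.64,
        "Pollution charge fees": 1.04,
    },
    "Industry attributes": {"Industry code": 7.23},
}

# Sum the secondary values within each primary variable.
group_totals = {group: round(sum(values.values()), 2)
                for group, values in importance.items()}

# Flatten and rank all secondary variables by relative importance.
flat = [(name, value) for values in importance.values()
        for name, value in values.items()]
ranked = sorted(flat, key=lambda item: item[1], reverse=True)
top_five = [name for name, _ in ranked[:5]]

print(group_totals)
print(top_five)
```
]]></preformat>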
</sec>
<sec id="s4-3">
<title>4.3 Partial dependence graph analysis</title>
<p>Based on the gradient boosting prediction model, we derive the partial dependence graphs (<xref ref-type="fig" rid="F3">Figure 3</xref>), which reflect the non-linear relationship between each influencing factor and corporate green innovation performance.<list list-type="simple">
<list-item>
<p>(1) Corporate green innovation performance improves as enterprises&#x2019; financial indicators rise; among the secondary indicators, paid-in capital or equity and total profit rank near the top in relative importance. On the one hand, sound and sustainable profitability is the material basis and fundamental guarantee for enterprises to increase capital accumulation, expand cash flow, and increase enterprise value. Companies with higher profits tend to have more financial reserves that can be converted into high-quality innovation resources and can better invest human, material, and financial resources. On the other hand, adequate solvency helps companies alleviate the pressure of debt repayment and refinancing caused by environmental investments. As environmental investments are large and have a long payback period, companies with adequate solvency are more resilient to risk and uncertainty and, therefore, more willing to fulfill their social responsibilities. In addition, according to the financial accelerator theory, companies with solid balance sheets can avoid financing constraints, expand financing channels, and thus obtain more financial support to promote green innovation.</p>
</list-item>
<list-item>
<p>(2) The impact of operating capacity indicators on corporate green innovation performance is heterogeneous, among which the positive driving effect of the operating income growth rate is the most prominent, and this indicator also ranks first in the relative importance ranking. First, enterprises with solid operating capacities usually perform well in coordination and optimization, thus ensuring the rational allocation and effective use of green innovation resources. Second, higher cash flow means more realizable assets and redundant resources to invest in environmental protection. Third, a high operating income growth rate means good business prospects, which raises market expectations and attracts high-quality green innovation resources, such as capital, talent, technology, and policy preference, for green innovation R&#x26;D.</p>
</list-item>
</list>
</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Partial dependency graph based on the gradient boosting prediction model.</p>
</caption>
<graphic xlink:href="fenvs-11-1252271-g003.tif"/>
</fig>
<p>However, as one of the manifestations of the development capacity of enterprises, the total asset turnover ratio has a negative driving effect. Enterprises with a high total asset turnover ratio have a pronounced growth capacity. They are in a period of rapid growth, focus more on their growth and expansion, and cannot devote sufficient resources to social responsibilities, including environmental protection (<xref ref-type="bibr" rid="B5">Campbell, 2007</xref>). Conversely, enterprises with a relatively low total asset turnover ratio may be mature and have greater flexibility to engage in environmental protection; furthermore, due to their limited development capacity and the profit motive, they are more inclined to engage in green innovation reforms to reduce their dependence on the environment, improve the efficiency of resource use, and thus enhance their competitiveness.<list list-type="simple">
<list-item>
<p>(3) Under the constraints of command-based environmental regulation, corporate green innovation performance may either increase or decrease, depending on the regulatory tool. The positive driving effect of environmental penalty decisions is the most pronounced. The reason is that, on the one hand, external stakeholders&#x2019; negative expectations and evaluations of the penalized enterprises lead to corporate financial losses, prompting managers to adopt green innovation strategies and rebuild their social image; on the other hand, the corporate economic losses resulting from the penalties force enterprises to compensate for their deficiencies, to improve the defects of corporate governance mechanisms and to produce more competitive green differentiated products.</p>
</list-item>
</list>
</p>
<p>The number of EIA document approvals for construction projects in the current year shows the opposite effect. To some extent, substitutability exists between the approval of EIA documents and the green patents obtained: by relying on the approval of environmental compliance by authorities, enterprises are bound to lose part of their innovation initiative. First, the cost of obtaining approvals may crowd out the green innovation inputs. Second, when the cost and the payback period of green investment are much higher than those of the EIA document approvals, the enterprise will need more incentive to improve its green innovation performance.<list list-type="simple">
<list-item>
<p>(4) The completed investment in pollution control projects in the current year and investment in industrial pollution control positively regulate corporate green innovation performance. On the one hand, pollution control investment stimulates enterprises to accelerate structural adjustment and promotes technological innovation and industrial upgrading; on the other hand, it guides the optimistic expectations of the market, attracts financial and private capital to flow into the green field, increases the accessibility of green financing for enterprises, and invigorates the development of green innovation.</p>
</list-item>
</list>
</p>
<p>The impact of the environmental tax shows an &#x201c;inverted U&#x201d; shape, which implies that a reasonable intensity is the key to implementing the environmental tax effectively. If the cost of environmental protection is much higher than that of the environmental tax, the tax will be ineffective in deterring enterprises from damaging the environment; on the other hand, overly stringent taxation may cause a deterioration in the enterprise&#x2019;s financial situation, resulting in a crowding-out effect on environmental investment behavior. In addition, it may trigger numerous enterprises&#x2019; relocation to avoid environmental pressures, thereby hindering regional economic development.</p>
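<p>The partial dependence curves behind these findings, including the inverted U-shape of the environmental tax, can be read as averages of model predictions with one indicator held at a fixed value. The following is a minimal sketch of that computation in plain Python, not the authors&#x2019; implementation: <monospace>partial_dependence</monospace>, <monospace>toy_model</monospace>, and the four sample rows are hypothetical, and the toy model only stands in for a fitted gradient boosting model.</p>
<preformat>
```python
from statistics import mean
from typing import Callable, List, Sequence

Row = List[float]


def partial_dependence(predict: Callable[[Row], float],
                       rows: Sequence[Row],
                       feature: int,
                       grid: Sequence[float]) -> List[float]:
    """Average prediction over all rows with one feature forced to each grid value."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = list(row)        # copy so the sample is not mutated
            modified[feature] = value   # hold the feature of interest fixed
            preds.append(predict(modified))
        curve.append(mean(preds))
    return curve


def toy_model(row: Row) -> float:
    """Hypothetical stand-in for a fitted model (assumption, not the paper's model).

    Column 0 plays the role of environmental tax intensity (concave response),
    column 1 the role of total profit (linear response).
    """
    tax, profit = row
    return -(tax - 2.0) ** 2 + 0.5 * profit


# Hypothetical sample of four enterprises: [tax intensity, profit].
sample = [[0.0, 1.0], [1.0, 3.0], [3.0, 2.0], [4.0, 6.0]]
tax_grid = [0.0, 1.0, 2.0, 3.0, 4.0]

pd_tax = partial_dependence(toy_model, sample, feature=0, grid=tax_grid)
peak = tax_grid[pd_tax.index(max(pd_tax))]  # grid value where the curve tops out
print(pd_tax, peak)
```
</preformat>
<p>In this toy setting the curve rises and then falls along the tax grid, which is the pattern described as an inverted U. With a real fitted model, only the <monospace>predict</monospace> callable and the sample would change.</p>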
<p>The above empirical results are significant for predicting and managing corporate green innovation performance in research and practice. Consistent with previous studies (<xref ref-type="bibr" rid="B15">Guo, 2019</xref>; <xref ref-type="bibr" rid="B27">Li and Xiao, 2020</xref>; <xref ref-type="bibr" rid="B52">Zhang, 2020</xref>), we found that the key to promoting corporate green innovation performance lies in effectively regulating the enterprises&#x2019; internal driving mechanism and rationally selecting external policy tools. Furthermore, this study not only offers a practical and effective corporate green innovation performance prediction model but also relies on a more diverse set of corporate financial indicators and environmental regulation tools to provide new empirical evidence for the debate on whether corporate financial capability and environmental regulation have a &#x201c;conflict&#x201d; or &#x201c;coordination&#x201d; effect on corporate green innovation performance, and it clarifies effective ways to incentivize corporate green innovation. Concretely:<list list-type="simple">
<list-item>
<p>1. Predict corporate green innovation performance. The gradient boosting model can help stakeholders accurately predict corporate green innovation performance through output indicators that reflect the green innovation performance of different enterprises.</p>
</list-item>
<list-item>
<p>2. Identify best practices. By comparing the green innovation performance of different enterprises and examining the relative importance and non-linear relationship of various influencing factors on corporate green innovation performance, decision makers such as governments and micro-entities can identify the best practices to optimize the top-level policy design and the resource base of enterprises and enhance the corporate green innovation performance.</p>
</list-item>
<list-item>
<p>3. Guide policy formulation. Accurate corporate green innovation performance prediction can provide an essential reference for policymakers to formulate targeted policies and interventions, thus realizing a &#x201c;win-win&#x201d; situation for environmental protection and enterprises&#x2019; competitiveness enhancement.</p>
</list-item>
<list-item>
<p>4. Simulate the effect. The prediction model can simulate the effects of various influencing factors on corporate green innovation performance. By simulating the impact of different policies and strategies, we can better understand the results of each influencing factor on the green innovation performance of enterprises and provide a reference for decision-making by all parties.</p>
</list-item>
</list>
</p>
<p>We also attempted to introduce deep learning models, such as neural networks, for prediction and comparison in this study; however, because deep learning models are prone to overfitting, their predictions were less accurate than those of the gradient boosting and random forest models. We tentatively attribute this to parameters that were not set optimally. In future studies, we will continue to refine our research on this issue and introduce deep learning models and other models into related research areas.</p>
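<p>Overfitting of this kind is usually detected by comparing errors on held-out data rather than on the training data. The sketch below illustrates the idea with k-fold cross-validation in plain Python. It is an illustration under stated assumptions, not the procedure used in this study: <monospace>k_fold_mse</monospace>, <monospace>fit_mean</monospace>, and <monospace>fit_nearest</monospace> are hypothetical names, the data are synthetic, and the two toy learners only stand in for a rigid and an overly flexible model.</p>
<preformat>
```python
from statistics import mean
from typing import Callable, Sequence

Predictor = Callable[[float], float]
Fitter = Callable[[Sequence[float], Sequence[float]], Predictor]


def k_fold_mse(fit: Fitter, xs: Sequence[float], ys: Sequence[float], k: int) -> float:
    """Mean squared error on held-out folds, averaged over k folds."""
    fold_errors = []
    for fold in range(k):
        test_idx = [i for i in range(len(xs)) if i % k == fold]
        train_idx = [i for i in range(len(xs)) if i % k != fold]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        fold_errors.append(mean((model(xs[i]) - ys[i]) ** 2 for i in test_idx))
    return mean(fold_errors)


def fit_mean(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Rigid baseline: always predict the training mean."""
    centre = mean(ys)
    return lambda x: centre


def fit_nearest(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Flexible learner that memorises the training set (1-nearest neighbour)."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


# Synthetic data: a constant signal of 1.0 with alternating noise of -1 and +1.
xs = [float(i) for i in range(8)]
ys = [0.0, 2.0] * 4

# The memorising learner is perfect on the data it was trained on ...
memoriser = fit_nearest(xs, ys)
in_sample_nearest = mean((memoriser(x) - y) ** 2 for x, y in zip(xs, ys))

# ... but worse than the simple baseline on held-out folds.
cv_nearest = k_fold_mse(fit_nearest, xs, ys, k=4)
cv_mean = k_fold_mse(fit_mean, xs, ys, k=4)
print(in_sample_nearest, cv_nearest, cv_mean)
```
</preformat>
<p>A model whose training error is far below its cross-validated error, as the memorising learner here, is fitting noise. The same comparison on held-out folds can guide the parameter tuning mentioned above.</p>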
</sec>
</sec>
<sec id="s5">
<title>5 Conclusion and implications</title>
<p>This study uses machine learning algorithms to predict the green innovation performance of micro-entities, examines the effectiveness of internal driving mechanisms and external environmental regulation tools, and empirically analyses the differentiated effects of heterogeneous corporate financial capabilities and environmental regulation tools on corporate green innovation performance in the Chinese context, providing valuable insights for the government to optimize the top-level design of policies and for enterprises to enhance their green competitiveness. The conclusions are as follows.</p>
<p>First, the gradient boosting algorithm can best predict corporate green innovation performance. Second, the relative importance of financial indicators and operating capacity is more prominent, and the non-linear influence of financial indicators on corporate green innovation performance has a significant positive incentive effect, indicating that the impetus from enterprises&#x2019; internal driving mechanism is crucial for enterprises&#x2019; green transformation. The relative importance of the industry attributes is noteworthy, implying that significant industry heterogeneity exists in the enterprises&#x2019; environmental strategy choices. Third, the effects of the operating capacity indicators in the internal driving mechanisms on the corporate green innovation performance show non-linearity and heterogeneity. The operating income growth rate presents a positive correlation trend, while the total asset turnover ratio has an inhibiting effect on the enhancement of green innovation performance, illustrating that enterprises in the rapid development period are prone to neglect the enhancement of green competitiveness. Fourth, similarly, regarding the command-based environmental regulation, the administrative penalty has an apparent inducing effect on the corporate green innovation performance, while the approvals of EIA documents for construction projects in the current year exhibit a crowding-out effect. In the incentive-based environmental regulation, the driving effect of the completed investment in pollution control projects and the investment in industrial pollution control is positive; nevertheless, the environmental tax indicator presents an inverted U-shape, implying that overly stringent environmental tax regulation may impede the development of corporate green innovation. Based on the research, we propose the following suggestions.<list list-type="simple">
<list-item>
<p>(1) To achieve incentives for corporate green innovation performance, the government should accurately position and formulate policies for each micro-entity according to the corporate green innovation performance prediction model and reasonably allocate innovation resources. First, reinforce the command-based policy control carried by laws and regulations, complemented by stringent enforcement, to achieve a fundamental transformation in the green development concept of enterprises. Second, increase investment in pollution control projects to incentivize enterprises to engage in green production and innovation. In addition to the subsidies provided by the central government, the government should improve the financing mechanism for the green transformation of enterprises and provide green credit support to alleviate the cost and the financing constraints of enterprises; furthermore, the government should promote the diversification of the investors and accelerate the marketization of environmental operation and management by establishing a sound market mechanism for environmental investment. Third, regulate the intensity of environmental regulation enforcement. For example, adjust the structure of the environmental tax system, improve the design of tax rates and taxation management, and enhance the elasticity of the tax system to avoid the adverse effects of excessive tax intervention.</p>
</list-item>
<list-item>
<p>(2) Enterprises should fully apply the green innovation performance prediction model to adjust their development plan and resource bases to achieve their innovation goals. First, they should improve the internal driving mechanism of green innovation and focus on financial and operational optimization. Specifically, they should accumulate redundant resources and enhance their operating capacity to fulfill their environmental responsibilities better and to build green competitive advantages; in addition, they should comply with the environmental regulatory constraints, enhance their social responsibility image, and improve investor evaluation to obtain more green innovation resources and preferential policies. Second, improve the transparency of corporate information by applying prediction models of corporate green innovation performance, effectively alleviating the degree of information asymmetry of internal and external investors and invigorating their enthusiasm for environmental protection investment. Third, industries or enterprises with outstanding green innovation capabilities should fully play their demonstration and leading roles in energy conservation, emission reduction, and green transformation to motivate the green development momentum of the whole market and achieve high-quality social and economic development.</p>
</list-item>
</list>
</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="s6">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="sec" rid="s11">Supplementary Material</xref>; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s7">
<title>Author contributions</title>
<p>JZ contributed to manuscript writing, data collection and compilation, experimental design, data analysis, and revision. KY contributed to the paper&#x2019;s idea, including the theme and method, the framework construction, and revision. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="s8">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="correction-note" id="s9">
<title>Correction note</title>
<p>This article has been corrected with minor changes. These changes do not impact the scientific content of the article.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec sec-type="supplementary-material" id="s11">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fenvs.2023.1252271/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fenvs.2023.1252271/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="Table1.XLS" id="SM1" mimetype="application/XLS" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Beaver</surname>
<given-names>W. H.</given-names>
</name>
</person-group> (<year>1966</year>). <article-title>Financial ratios as predictors of failure</article-title>. <source>J. Account. Res.</source> <volume>4</volume>, <fpage>71</fpage>&#x2013;<lpage>111</lpage>. <pub-id pub-id-type="doi">10.2307/2490171</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Berrone</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Fosfuri</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Gelabert</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gomez-Mejia</surname>
<given-names>L. R.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Necessity as the mother of green inventions: institutional pressures and environmental innovations</article-title>. <source>Strategic Manag. J.</source> <volume>34</volume> (<issue>8</issue>), <fpage>891</fpage>&#x2013;<lpage>909</lpage>. <pub-id pub-id-type="doi">10.1002/smj.2041</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bo</surname>
<given-names>W. G.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J. F.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Local government competition and environmental regulation heterogeneity: race to the bottom or race to the top?</article-title> <source>Soft Sci.</source> <volume>11</volume>, <fpage>76</fpage>&#x2013;<lpage>93</lpage>. <pub-id pub-id-type="doi">10.3969/j.issn.1002-9753.2018.11.009</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Breiman</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Friedman</surname>
<given-names>J. H.</given-names>
</name>
<name>
<surname>Olshen</surname>
<given-names>R. A.</given-names>
</name>
<name>
<surname>Stone</surname>
<given-names>C. J.</given-names>
</name>
</person-group> (<year>1984</year>). <article-title>Classification and regression trees (CART)</article-title>. <source>Biometrics</source> <volume>40</volume> (<issue>3</issue>), <fpage>358</fpage>. <pub-id pub-id-type="doi">10.2307/2530946</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Campbell</surname>
<given-names>J. L.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Why would corporations behave in socially responsible ways? An institutional theory of corporate social responsibility</article-title>. <source>Acad. Manag. Rev.</source> <volume>32</volume> (<issue>3</issue>), <fpage>946</fpage>&#x2013;<lpage>967</lpage>. <pub-id pub-id-type="doi">10.5465/amr.2007.25275684</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>Y. T.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Analytical comparison of random forest and gradient boosting decision trees for integrated learning algorithms</article-title>. <source>Comput. Knowl. Technol.</source> <volume>17</volume> (<issue>15</issue>), <fpage>32</fpage>&#x2013;<lpage>34</lpage>. <pub-id pub-id-type="doi">10.14004/j.cnki.ckt.2021.1441</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cheng</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Ioannou</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Serafeim</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Corporate social responsibility and access to finance</article-title>. <source>Strategic Manag. J.</source> <volume>35</volume> (<issue>1</issue>), <fpage>1</fpage>&#x2013;<lpage>23</lpage>. <pub-id pub-id-type="doi">10.1002/smj.2131</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chien</surname>
<given-names>S. C.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>T. Y.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>S. L.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Application of neuro-fuzzy networks to forecast innovation performance - the example of Taiwanese manufacturing industry</article-title>. <source>Expert Syst. Appl.</source> <volume>37</volume> (<issue>2</issue>), <fpage>1086</fpage>&#x2013;<lpage>1095</lpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2009.06.107</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cho</surname>
<given-names>C. H.</given-names>
</name>
<name>
<surname>Jung</surname>
<given-names>J. H.</given-names>
</name>
<name>
<surname>Kwak</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yoo</surname>
<given-names>C. Y.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Professors on the board: do they contribute to society outside the classroom?</article-title> <source>J. Bus. Ethics</source> <volume>141</volume> (<issue>2</issue>), <fpage>393</fpage>&#x2013;<lpage>409</lpage>. <pub-id pub-id-type="doi">10.1007/s10551-015-2718-x</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>De la Paz-Marin</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Campoy-Munoz</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Hervas-Martinez</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Non-linear multiclassifier model based on Artificial Intelligence to predict research and development performance in European Countries</article-title>. <source>Technol. Forecast Soc. Change</source> <volume>79</volume>, <fpage>1731</fpage>&#x2013;<lpage>1745</lpage>. <pub-id pub-id-type="doi">10.1016/j.techfore.2012.06.001</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deng</surname>
<given-names>Y. P.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>W. J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Does environmental regulation promote green innovation capability? &#x2013;Evidence from China</article-title>. <source>Stat. Res.</source> <volume>38</volume> (<issue>7</issue>), <fpage>76</fpage>&#x2013;<lpage>86</lpage>. <pub-id pub-id-type="doi">10.19343/j.cnki.11-1302/c.2021.07.006</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Duan</surname>
<given-names>Y. Q.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>S. L.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Command-based environmental regulation and heavy polluters&#x2019; investment: incentive or disincentive?</article-title> <source>J. Financial Dev. Res.</source> (<issue>07</issue>), <fpage>54</fpage>&#x2013;<lpage>61</lpage>. <pub-id pub-id-type="doi">10.19647/j.cnki.37-1462/f.2021.07.008</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fan</surname>
<given-names>J. Z.</given-names>
</name>
<name>
<surname>Lang</surname>
<given-names>Q.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Empirical analysis on financial ability of the listed companies in Shanxi Province</article-title>. <source>J. Shanxi Univ. (Philosophy Soc. Sci.)</source> <volume>30</volume> (<issue>4</issue>), <fpage>68</fpage>&#x2013;<lpage>71</lpage>. <pub-id pub-id-type="doi">10.13451/j.cnki.shanxi.univ(phil.soc.).2007.04.014</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Friedman</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Greedy function approximation: a gradient boosting machine</article-title>. <source>Ann. Statistics</source> <volume>29</volume>, <fpage>1189</fpage>&#x2013;<lpage>1232</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1013203451</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guo</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>The effects of environmental regulation on green technology innovation-Evidence of Porter effect in China</article-title>. <source>Finance Trade Econ.</source> <volume>40</volume> (<issue>3</issue>), <fpage>147</fpage>&#x2013;<lpage>160</lpage>.</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hart</surname>
<given-names>S. L.</given-names>
</name>
</person-group> (<year>1995</year>). <article-title>A natural-resource-based view of the firm</article-title>. <source>Acad. Manag. Rev.</source> <volume>20</volume> (<issue>4</issue>), <fpage>986</fpage>&#x2013;<lpage>1014</lpage>. <pub-id pub-id-type="doi">10.2307/258963</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hajek</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Henriques</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Castelli</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Vanneschi</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Forecasting performance of regional innovation systems using semantic-based genetic programming with local search optimizer</article-title>. <source>Comput. Operations Res.</source> <volume>106</volume> (<issue>6</issue>), <fpage>179</fpage>&#x2013;<lpage>190</lpage>. <pub-id pub-id-type="doi">10.1016/j.cor.2018.02.001</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hajek</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Henriques</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Modelling innovation performance of European regions using multi-output neural networks</article-title>. <source>PLoS One</source> <volume>12</volume>, <fpage>e0185755</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0185755</pub-id>
<pub-id pub-id-type="pmid">28968449</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hao</surname>
<given-names>J. L.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Research on the moderating role of financial performance on enterprises&#x27; R&#x26;D investment and innovation performance--taking high-tech enterprises as an example</article-title>. <source>Commer. Account.</source> <volume>21</volume>, <fpage>43</fpage>&#x2013;<lpage>46</lpage>.</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hasan</surname>
<given-names>L. M.</given-names>
</name>
<name>
<surname>Zgair</surname>
<given-names>L. A.</given-names>
</name>
<name>
<surname>Ngotoye</surname>
<given-names>A. A.</given-names>
</name>
<name>
<surname>Hussain</surname>
<given-names>H. N.</given-names>
</name>
<name>
<surname>Najmuldeen</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>A review of the factors that influence the adoption of cloud computing by small and medium enterprises</article-title>. <source>Scholars J. Econ. Bus. Manag.</source> <volume>2</volume>, <fpage>842</fpage>&#x2013;<lpage>848</lpage>.</citation>
</ref>
<ref id="B21">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hastie</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Tibshirani</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Friedman</surname>
<given-names>J. H.</given-names>
</name>
</person-group> (<year>2009</year>). <source>The elements of statistical learning: data mining, inference, and prediction</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>Springer</publisher-name>.</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ho</surname>
<given-names>Y. C.</given-names>
</name>
<name>
<surname>Tsai</surname>
<given-names>C. T.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Comparing ANFIS and SEM in linear and nonlinear forecasting of new product development performance</article-title>. <source>Expert Syst. Appl.</source> <volume>38</volume> (<issue>6</issue>), <fpage>6498</fpage>&#x2013;<lpage>6507</lpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2010.11.095</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hojnik</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Ruzzier</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>What drives eco-innovation? A review of emerging literature</article-title>. <source>Environ. Innovation Soc. Transitions</source> <volume>19</volume>, <fpage>31</fpage>&#x2013;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.1016/j.eist.2015.09.006</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jerome</surname>
<given-names>H. F.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Greedy function approximation: a gradient boosting machine</article-title>. <source>Ann. Statistics</source> <volume>11</volume> (<issue>10</issue>), <fpage>877</fpage>&#x2013;<lpage>884</lpage>.</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kotsiantis</surname>
<given-names>S. B.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Decision trees: a recent overview</article-title>. <source>Artif. Intell. Rev.</source> <volume>39</volume> (<issue>4</issue>), <fpage>261</fpage>&#x2013;<lpage>283</lpage>. <pub-id pub-id-type="doi">10.1007/s10462-011-9272-4</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>D. Y.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ren</surname>
<given-names>S. G.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>X. H.</given-names>
</name>
<name>
<surname>Ning</surname>
<given-names>L. T.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Environmental legitimacy, green innovation, and corporate carbon disclosure: evidence from CDP China 100</article-title>. <source>J. Bus. Ethics</source> <volume>150</volume> (<issue>4</issue>), <fpage>1089</fpage>&#x2013;<lpage>1104</lpage>. <pub-id pub-id-type="doi">10.1007/s10551-016-3187-6</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>Q. Y.</given-names>
</name>
<name>
<surname>Xiao</surname>
<given-names>Z. H.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Heterogeneous environmental regulation tools and green innovation incentives: evidence from green patents of listed companies</article-title>. <source>Econ. Res. J.</source> (<issue>9</issue>), <fpage>192</fpage>&#x2013;<lpage>208</lpage>.</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>W. A.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y. W.</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>M. N.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X. L.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>G. Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>H.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Research on green governance of Chinese listed companies and its evaluation</article-title>. <source>Manag. World</source> (<issue>05</issue>), <fpage>126</fpage>&#x2013;<lpage>133&#x2b;160</lpage>. <pub-id pub-id-type="doi">10.19744/j.cnki.11-1235/f.2019.0070</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Z. K.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H. Y.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>The influence of top managers&#x2019; military experience on enterprise green innovation</article-title>. <source>J. Soft Sci.</source> <volume>12</volume>, <fpage>74</fpage>&#x2013;<lpage>80</lpage>. <pub-id pub-id-type="doi">10.13956/j.ss.1001-8409.2021.12.12</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Corporate financial capability evaluation and path selection: empirical evidence from listed pharmaceutical companies</article-title>. <source>Commun. Finance Account.</source> (<issue>4</issue>), <fpage>41</fpage>&#x2013;<lpage>44</lpage>.</citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lu</surname>
<given-names>J. C.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>G. S.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Can CEO green experience promote the green innovation?</article-title> <source>Bus. Manag. J.</source> (<issue>2</issue>), <fpage>106</fpage>&#x2013;<lpage>121</lpage>. <pub-id pub-id-type="doi">10.19616/j.cnki.bmj.2022.02.007</pub-id>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Chau</surname>
<given-names>K. W.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Pan</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>A decade&#x2019;s debate on the nexus between corporate social and corporate financial performance: a critical review of empirical studies 2002&#x2013;2011</article-title>. <source>J. Clean. Prod.</source> <volume>79</volume>, <fpage>195</fpage>&#x2013;<lpage>206</lpage>. <pub-id pub-id-type="doi">10.1016/j.jclepro.2014.04.072</pub-id>
</citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Manso</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Motivating innovation</article-title>. <source>J. Finance</source> <volume>66</volume> (<issue>5</issue>), <fpage>1823</fpage>&#x2013;<lpage>1860</lpage>. <pub-id pub-id-type="doi">10.1111/j.1540-6261.2011.01688.x</pub-id>
</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Meng</surname>
<given-names>L. P.</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>C. F.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>P. J.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>An empirical study on intertemporal interaction between corporate environmental responsibility and corporate financial performance</article-title>. <source>J. Tongji Univ. Soc. Sci. Ed.</source> <volume>34</volume> (<issue>2</issue>), <fpage>107</fpage>&#x2013;<lpage>117</lpage>.</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Montmartin</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Herrera</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Internal and external effects of R&#x26;D subsidies and fiscal incentives: empirical evidence using spatial dynamic panel models</article-title>. <source>Res. Policy</source> <volume>44</volume> (<issue>5</issue>), <fpage>1065</fpage>&#x2013;<lpage>1079</lpage>. <pub-id pub-id-type="doi">10.1016/j.respol.2014.11.013</pub-id>
</citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Petroni</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Bigliardi</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Galati</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Rethinking the Porter Hypothesis: the underappreciated importance of value appropriation and pollution intensity</article-title>. <source>Rev. Policy Res.</source> <volume>36</volume> (<issue>1</issue>), <fpage>121</fpage>&#x2013;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.1111/ropr.12317</pub-id>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Porter</surname>
<given-names>M. E.</given-names>
</name>
<name>
<surname>Van der Linde</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>1995</year>). <article-title>Toward a new conception of the environment-competitiveness relationship</article-title>. <source>J. Econ. Perspect.</source> <volume>9</volume> (<issue>4</issue>), <fpage>97</fpage>&#x2013;<lpage>118</lpage>. <pub-id pub-id-type="doi">10.1257/jep.9.4.97</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Preston</surname>
<given-names>L. E.</given-names>
</name>
<name>
<surname>O&#x2019;Bannon</surname>
<given-names>D. P.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>The corporate social-financial performance relationship: a typology and analysis</article-title>. <source>Bus. Soc.</source> <volume>36</volume> (<issue>4</issue>), <fpage>419</fpage>&#x2013;<lpage>429</lpage>. <pub-id pub-id-type="doi">10.1177/000765039703600406</pub-id>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qi</surname>
<given-names>S. Z.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>J. B.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Do environmental rights trading schemes induce green innovation? Evidence from listed firms in China</article-title>. <source>Econ. Res. J.</source> <volume>12</volume>, <fpage>129</fpage>&#x2013;<lpage>143</lpage>.</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ross</surname>
<given-names>S. A.</given-names>
</name>
</person-group> (<year>1977</year>). <article-title>The determination of financial structure: the incentive-signalling approach</article-title>. <source>Bell J. Econ.</source> <volume>8</volume> (<issue>1</issue>), <fpage>23</fpage>&#x2013;<lpage>40</lpage>. <pub-id pub-id-type="doi">10.2307/3003485</pub-id>
</citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Samara</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Georgiadis</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Bakouros</surname>
<given-names>I.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>The impact of innovation policies on the performance of national innovation systems: a system dynamics analysis</article-title>. <source>Technovation</source> <volume>32</volume>, <fpage>624</fpage>&#x2013;<lpage>638</lpage>. <pub-id pub-id-type="doi">10.1016/j.technovation.2012.06.002</pub-id>
</citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sheng</surname>
<given-names>Y. M.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y. Y.</given-names>
</name>
<name>
<surname>Xiao</surname>
<given-names>Y. L.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>An empirical study on the interaction between environmental responsibility and corporate financial performance of listed companies</article-title>. <source>Statistics Decis.</source> <volume>35</volume> (<issue>19</issue>), <fpage>172</fpage>&#x2013;<lpage>176</lpage>. <pub-id pub-id-type="doi">10.13546/j.cnki.tjyjc.2019.19.039</pub-id>
</citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tan</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>G. W.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Differences in environmental regulation and green innovation of enterprises driven by &#x201c;Two wheels&#x201d;&#x2014;based on signal transmission theory</article-title>. <source>Soft Sci</source>.</citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tibshirani</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>1996</year>). <article-title>Regression shrinkage and selection via the Lasso</article-title>. <source>J. R. Stat. Soc. Ser. B (Methodological)</source> <volume>58</volume> (<issue>1</issue>), <fpage>267</fpage>&#x2013;<lpage>288</lpage>. <pub-id pub-id-type="doi">10.1111/j.2517-6161.1996.tb02080.x</pub-id>
</citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Triguero</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Moreno</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Davia</surname>
<given-names>M. A.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Drivers of different types of eco-innovation in European SMEs</article-title>. <source>Ecol. Econ.</source> <volume>92</volume>, <fpage>25</fpage>&#x2013;<lpage>33</lpage>. <pub-id pub-id-type="doi">10.1016/j.ecolecon.2013.04.009</pub-id>
</citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Waddock</surname>
<given-names>S. A.</given-names>
</name>
<name>
<surname>Graves</surname>
<given-names>S. B.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>The corporate social performance-financial performance link</article-title>. <source>Strategic Manag. J.</source> <volume>18</volume> (<issue>4</issue>), <fpage>303</fpage>&#x2013;<lpage>319</lpage>. <pub-id pub-id-type="doi">10.1002/(sici)1097-0266(199704)18:4&#x3c;303::aid-smj869&#x3e;3.0.co;2-g</pub-id>
</citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>B. B.</given-names>
</name>
<name>
<surname>Qi</surname>
<given-names>S. Z.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>The effect of market-oriented and command-and-control policy tools on emissions reduction innovation&#x2014;an empirical analysis based on China&#x2019;s industrial patent data</article-title>. <source>Chin. Ind. Econ.</source> (<issue>6</issue>), <fpage>91</fpage>&#x2013;<lpage>108</lpage>. <pub-id pub-id-type="doi">10.19581/j.cnki.ciejournal.2016.06.008</pub-id>
</citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>J. R.</given-names>
</name>
<name>
<surname>Cui</surname>
<given-names>R. X.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Stakeholder environmental pressure, external knowledge adoption, green innovation&#x2014;moderation effects of market uncertainty and slack resource</article-title>. <source>Res. Dev. Manag.</source> <volume>33</volume> (<issue>4</issue>), <fpage>15</fpage>&#x2013;<lpage>27</lpage>.</citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>T. Y.</given-names>
</name>
<name>
<surname>Chien</surname>
<given-names>S. C.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Forecasting innovation performance via neural networks&#x2014;a case of Taiwanese manufacturing industry</article-title>. <source>Technovation</source> <volume>26</volume>, <fpage>635</fpage>&#x2013;<lpage>643</lpage>. <pub-id pub-id-type="doi">10.1016/j.technovation.2004.11.001</pub-id>
</citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Research on the green innovation promoted by green credit policies</article-title>. <source>J. Manag. World</source> (<issue>06</issue>), <fpage>173</fpage>&#x2013;<lpage>188&#x2b;11</lpage>. <pub-id pub-id-type="doi">10.19744/j.cnki.11-1235/f.2021.0085</pub-id>
</citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>B. C.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>S. K.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Research on the impact of government subsidies on green innovation of enterprises&#x2014;the moderating effect of political connection and environmental regulation</article-title>. <source>Sci. Res. Manag</source>.</citation>
</ref>
<ref id="B52">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2020</year>). <source>Research on the effectiveness of environmental management system certification from the perspective of corporate finance and institutional environment [dissertation]</source>. <publisher-loc>[Wuhan, Hubei]</publisher-loc>: <publisher-name>Huazhong University of Science and Technology</publisher-name>.</citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z. G.</given-names>
</name>
<name>
<surname>Bao</surname>
<given-names>L. L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>A study on interactive and intertemporal influence and mechanism of corporate environmental responsibility and financial performance</article-title>. <source>Manag. Rev.</source> <volume>32</volume> (<issue>2</issue>), <fpage>76</fpage>&#x2013;<lpage>89</lpage>. <pub-id pub-id-type="doi">10.14120/j.cnki.cn11-5057/f.2020.02.007</pub-id>
</citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Zheng</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Kong</surname>
<given-names>D. M.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Qin</surname>
<given-names>L.</given-names>
</name>
<etal/>
</person-group> (<year>2019a</year>). <article-title>Local environmental governance pressure, executive&#x2019;s working experience and enterprise investment in environmental protection: a quasi-natural experiment based on China&#x2019;s &#x201c;Ambient air quality standards 2012&#x201d;</article-title>. <source>Econ. Res. J.</source> <volume>18</volume> (<issue>6</issue>), <fpage>183</fpage>&#x2013;<lpage>198</lpage>. <pub-id pub-id-type="doi">10.1186/s12944-019-1126-0</pub-id>
</citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>X. W.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>The construction of a corporate financial capability diagnostic index system</article-title>. <source>Finance Account. Mon.</source> (<issue>23</issue>), <fpage>12</fpage>&#x2013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.19641/j.cnki.42-1290/f.2003.23.007</pub-id>
</citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Z. G.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Cao</surname>
<given-names>D. T.</given-names>
</name>
</person-group> (<year>2019b</year>). <article-title>Is environmental management system certification of the enterprise effective?</article-title> <source>Nankai Bus. Rev.</source> <volume>22</volume> (<issue>04</issue>), <fpage>123</fpage>&#x2013;<lpage>134</lpage>. <pub-id pub-id-type="doi">10.3969/j.issn.1008-3448.2019.04.012</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>