<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Med.</journal-id>
<journal-title>Frontiers in Medicine</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Med.</abbrev-journal-title>
<issn pub-type="epub">2296-858X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmed.2025.1598065</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Medicine</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Ensemble machine learning for predicting renal function decline in chronic kidney disease: development and external validation</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Chen</surname><given-names>Hong</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/3012437/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Huang</surname><given-names>Yuping</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Chen</surname><given-names>Lizhen</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Nephrology, the 95th Hospital of Putian in China RongTong Medical Health Corporation</institution>, <addr-line>Putian</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Rheumatology and Immunology, the 95th Hospital of Putian in China RongTong Medical Health Corporation</institution>, <addr-line>Putian</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by" id="fn0001"><p>Edited by: <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/716436/overview">Olaniyi Samuel Iyiola</ext-link>, Morgan State University, United States</p></fn>
<fn fn-type="edited-by" id="fn0002"><p>Reviewed by: <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1518580/overview">&#x00D6;mer Faruk &#x00C7;i&#x00E7;ek</ext-link>, Selcuk University, T&#x00FC;rkiye</p>
<p>Samit Kumar Ghosh, Khalifa University, United Arab Emirates</p></fn>
<corresp id="c001">&#x002A;Correspondence: Hong Chen, <email>yykw76@163.com</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>27</day>
<month>10</month>
<year>2025</year>
</pub-date>
<pub-date pub-type="collection">
<year>2025</year>
</pub-date>
<volume>12</volume>
<elocation-id>1598065</elocation-id>
<history>
<date date-type="received">
<day>22</day>
<month>03</month>
<year>2025</year>
</date>
<date date-type="accepted">
<day>25</day>
<month>09</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2025 Chen, Huang and Chen.</copyright-statement>
<copyright-year>2025</copyright-year>
<copyright-holder>Chen, Huang and Chen</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<sec>
<title>Introduction</title>
<p>Chronic kidney disease (CKD) poses a significant global health challenge, requiring timely interventions to manage renal function decline. Traditional predictive models often lack accuracy and generalizability. This study aimed to develop and validate a machine learning model to enhance risk prediction of renal function decline in CKD patients, enabling early and personalized interventions.</p>
</sec>
<sec>
<title>Methods</title>
<p>We developed an ensemble machine learning model using Random Forest, XGBoost, and LightGBM algorithms, incorporating advanced feature selection and hyperparameter tuning. The model was trained and validated on data from 1,200 CKD patients across multiple clinics, selected through stringent inclusion and exclusion criteria. Clinical, demographic, and laboratory data were processed with rigorous quality control. Model performance was assessed using area under the curve (AUC), calibration metrics, and five-fold cross-validation, with external validation across three medical centers.</p>
</sec>
<sec>
<title>Results</title>
<p>The ensemble model achieved an AUC of 0.89 (95% CI: 0.87-0.91), outperforming traditional Cox models (AUC: 0.82, 95% CI: 0.79-0.85) and standard machine learning approaches (AUC: 0.85, 95% CI: 0.83-0.87). Key predictors identified via SHAP analysis included estimated glomerular filtration rate (eGFR), age, and urinary protein-creatinine ratio. The model demonstrated excellent calibration (slope: 0.96, 95% CI: 0.94-0.98) and robust performance across diverse patient subgroups, with a 60.6% reduction in computational resource use compared to traditional methods.</p>
</sec>
<sec>
<title>Discussion</title>
<p>This machine learning model offers a significant advancement in predicting CKD progression, providing a reliable, generalizable tool for early risk stratification. Its superior accuracy and efficiency support integration into clinical workflows, potentially transforming CKD management by enabling proactive, data-driven interventions. Future research should focus on incorporating novel biomarkers and expanding multicenter validation to further enhance clinical applicability.</p>
</sec>
</abstract>
<kwd-group>
<kwd>machine learning</kwd>
<kwd>chronic kidney disease progression</kwd>
<kwd>risk prediction modeling</kwd>
<kwd>clinical decision support</kwd>
<kwd>precision nephrology</kwd>
</kwd-group>
<counts>
<fig-count count="10"/>
<table-count count="11"/>
<equation-count count="11"/>
<ref-count count="45"/>
<page-count count="15"/>
<word-count count="9935"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Nephrology</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="sec1">
<label>1</label>
<title>Introduction</title>
<p>Chronic kidney disease (CKD) has become an urgent health challenge worldwide. In developed countries, the annual medical cost of CKD exceeds 120 billion US dollars, and its prevalence continues to rise, placing pressure on healthcare systems globally (<xref ref-type="bibr" rid="ref1">1</xref>, <xref ref-type="bibr" rid="ref2">2</xref>). Traditional methods for predicting CKD progression rely heavily on a few scattered clinical indicators and simple linear models, so their predictive accuracy is relatively poor, with an AUC below 0.75 (<xref ref-type="bibr" rid="ref3 ref4 ref5">3&#x2013;5</xref>). Their generalizability across different patient groups is also limited. Compared with traditional statistical methods (<xref ref-type="bibr" rid="ref6 ref7 ref8">6&#x2013;8</xref>), machine learning has increased accuracy in healthcare applications by 15 to 20%. However, existing CKD prediction models still have key limitations, including a lack of interpretability, an inability to capture the dynamic changes of the disease over time, and insufficient validation in different clinical settings (<xref ref-type="bibr" rid="ref9 ref10 ref11">9&#x2013;11</xref>).</p>
<p>Early research on CKD risk prediction relied on traditional statistical methods. In (<xref ref-type="bibr" rid="ref12">12</xref>), the authors constructed a structural equation model for CKD risk prediction. In (<xref ref-type="bibr" rid="ref13">13</xref>), the authors selected factors associated with renal failure from a large number of variables and then built a Cox proportional hazards regression model to predict and evaluate the risk of renal failure. Although traditional statistical methods can predict CKD risk, their accuracy is relatively low. With the continued development of machine learning, researchers have begun to explore machine learning-based prediction. In (<xref ref-type="bibr" rid="ref14">14</xref>), the authors used techniques such as random forests and decision trees, effectively improving predictive performance. In (<xref ref-type="bibr" rid="ref15">15</xref>), the authors combined five machine learning methods, including Naive Bayes and Random Forest, with feature selection and ensemble learning, and used SHAP and LIME to visualize personalized CKD predictions, thereby enhancing model interpretability and offering a new perspective for CKD research. In (<xref ref-type="bibr" rid="ref16">16</xref>), the authors trained several machine learning models, including CatBoost, AdaBoost, and Extra Trees, on the medical records of 400 patients and reached an accuracy of 97.5%, showing the potential of ensemble learning for the early diagnosis of CKD. In (<xref ref-type="bibr" rid="ref17">17</xref>), the authors proposed an interpretability strategy that applies five machine learning methods to CKD datasets and uses LIME to enhance model interpretability.
Our code is publicly available at: <ext-link xlink:href="https://gitee.com/forest-AI/CDK-Model" ext-link-type="uri">https://gitee.com/forest-AI/CDK-Model</ext-link>.</p>
<p>This study addresses these challenges through four key innovations that extend current CKD risk prediction capabilities. First, we introduce a temporal feature engineering framework (<xref ref-type="bibr" rid="ref18">18</xref>) that systematically captures both the short-term changes and the long-term trends of the disease. Compared with the static snapshot methods used in existing models, this framework allows the model not only to assess risk at a single time point but also to learn the progression pattern of the disease. Second, we optimized the ensemble architecture, reducing the demand for computing resources by 60.6% while maintaining good predictive performance; this directly addresses the practical implementation obstacles that many complex machine learning systems encounter in clinical applications. Third, we developed a comprehensive multi-center external validation strategy across three medical centers and conducted detailed analyses of resource utilization and scalability, providing the evidence on real-world deployment that theoretical machine learning research often lacks. Fourth, we designed the model from a clinical perspective, combining domain knowledge with advanced feature selection methods to create an interpretable decision support framework that bridges the gap between complex computational methods and actual clinical applications.</p>
</sec>
<sec sec-type="methods" id="sec2">
<label>2</label>
<title>Methods</title>
<sec id="sec3">
<label>2.1</label>
<title>Study subjects</title>
<p>Selecting an appropriate study population is a critical and painstaking step in building a robust machine learning model for predicting renal function decline. To improve generalizability and external validity, our multi-center study used a selection protocol designed to yield a representative dataset (<xref ref-type="bibr" rid="ref9">9</xref>, <xref ref-type="bibr" rid="ref10">10</xref>).</p>
<p>The initial screening covered 2,500 potential candidates across five tertiary healthcare centers (<xref ref-type="fig" rid="fig1">Figure 1</xref>). Inclusion criteria were developed from clinical guidelines, specifically the Kidney Disease: Improving Global Outcomes (KDIGO) 2012 guidelines. Eligible participants were adults aged 18&#x2013;75&#x202F;years with documented chronic kidney disease (CKD), defined as an estimated glomerular filtration rate (eGFR)&#x202F;&#x003C;&#x202F;60&#x202F;mL/min/1.73m<sup>2</sup> for at least 3&#x202F;months, confirmed by at least two measurements, or the presence of persistent proteinuria (urine protein-to-creatinine ratio [UPCR]&#x202F;&#x2265;&#x202F;0.2&#x202F;g/g for at least 3&#x202F;months) or other markers of kidney damage (e.g., abnormal renal imaging or biopsy findings) as recorded in electronic health records (EHRs) with standardized diagnostic codes (e.g., ICD-10 codes N18.1-N18.5). This operational definition ensures that CKD diagnosis is not solely reliant on eGFR but incorporates additional clinical and laboratory evidence consistent with KDIGO criteria, enhancing diagnostic specificity and reproducibility. Two longitudinal record requirements were established: a minimum of 2&#x202F;years of EHR data within the period from January 2021 to December 2024, and a minimum of four serum creatinine tests conducted within the 12&#x202F;months prior to the study&#x2019;s end date (December 2024). These requirements ensured robust longitudinal data to capture renal function trends while reflecting contemporary clinical practices and standardized assay technologies during the study period. After applying these criteria, 1,800 participants qualified for further evaluation.</p>
<fig position="float" id="fig1">
<label>Figure 1</label>
<caption>
<p>Study flow diagram.</p>
</caption>
<graphic xlink:href="fmed-12-1598065-g001.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Flowchart detailing participant selection for a study. Initial screening includes 2,500 individuals. Inclusion criteria: age 18-75, documented CKD, over two years of medical records, complete clinical data, informed consent, totaling 1,800. Exclusion criteria: acute kidney injury, active cancer treatment, severe comorbidities, incomplete or inconsistent records, reducing the sample by 600. Final analyzed sample is 1,200.</alt-text>
</graphic>
</fig>
<p>To protect data quality and minimize potential confounding, detailed exclusion criteria were applied (<xref ref-type="bibr" rid="ref19">19</xref>). Key exclusion criteria were active malignancy or chemotherapy within the 3&#x202F;months preceding the study, kidney transplantation, acute renal failure, other severe comorbid conditions that may affect renal function, and poor or incomplete medical record documentation. Applying these criteria removed 600 participants.</p>
<p>After all criteria were applied, 1,200 participants remained, a sample size determined through a power analysis (<italic>&#x03B1;</italic>&#x202F;=&#x202F;0.05, <italic>&#x03B2;</italic>&#x202F;=&#x202F;0.10) based on expected model performance and complexity. Here, <italic>&#x03B1;</italic>&#x202F;=&#x202F;0.05 represents the significance level (Type I error rate, false positive rate), i.e., the probability threshold for rejecting the null hypothesis. <italic>&#x03B2;</italic>&#x202F;=&#x202F;0.10 represents the Type II error rate (false negative rate), corresponding to a statistical power of 1&#x2013;&#x03B2;&#x202F;=&#x202F;0.90 (90% power), i.e., the probability of correctly detecting a true effect (<xref ref-type="bibr" rid="ref20">20</xref>). This systematic selection, with stringent inclusion and exclusion criteria, enhances the quality of the dataset and makes it suitable for developing and validating complex predictive models of kidney function decline. Restricting the cohort to participants who are clinically and demographically suitable for the model defines its &#x201C;target population&#x201D; and increases its utility in clinical practice.</p>
</sec>
<sec id="sec4">
<label>2.2</label>
<title>Data collection and processing</title>
<p>A thorough data collection and processing strategy was designed to provide clean, usable data for the machine learning model. The quality control system comprised three sequential phases (data collection, preprocessing, and feature engineering), with quality checks integrated at each stage (<xref ref-type="fig" rid="fig2">Figure 2</xref>).</p>
<fig position="float" id="fig2">
<label>Figure 2</label>
<caption>
<p>Data processing pipeline.</p>
</caption>
<graphic xlink:href="fmed-12-1598065-g002.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Flowchart depicting the data processing workflow. It begins with data collection, including clinical features and laboratory tests, followed by data preprocessing involving missing value handling and normalization. Feature engineering comprises feature selection and extraction. Detailed processing steps include raw data cleaning, processing, feature selection, and culminate in the final dataset.</alt-text>
</graphic>
</fig>
<p>The data collection protocol was designed to ensure consistency and accuracy across all participating centers. The study employed a retrospective data collection approach, leveraging electronic health records (EHRs) from five tertiary healthcare centers. The data spanned January 2021 to December 2024 and captured comprehensive clinical and laboratory parameters relevant to kidney function assessment, defined according to existing protocols (<xref ref-type="bibr" rid="ref21">21</xref>, <xref ref-type="bibr" rid="ref22">22</xref>). The clinical data consisted of demographic features, comorbid conditions, medication usage, and other clinically relevant observations, and were extracted through standardized EHR protocols. Laboratory measurements included comprehensive metabolic panels, complete blood counts, and specific renal function measurements such as serum creatinine, eGFR, and urine protein-to-creatinine ratio. Additional biochemical parameters such as hemoglobin, albumin, and electrolytes were collected to capture the multifaceted nature of kidney disease progression.</p>
<p>Data preprocessing was executed within a strict quality assurance framework, as described in (<xref ref-type="bibr" rid="ref23">23</xref>). Participant flow was tracked throughout: 2,500 potential candidates were initially screened across five tertiary healthcare centers, 1,800 remained eligible after the inclusion criteria were applied, and 1,200 remained after the exclusion criteria were enforced (see <xref ref-type="fig" rid="fig1">Figure 1</xref> for the study flow diagram). Analysis of missing data patterns indicated that data were primarily missing at random (MAR), with serum creatinine and urine protein-to-creatinine ratio (UPCR) missing in approximately 8 and 12% of cases, respectively, due to variations in clinical testing frequency. Missing values were imputed using multiple imputation by chained equations (MICE) for continuous data and mode imputation for categorical data, with validation tests confirming imputation precision (mean absolute error &#x003C;5% for continuous variables). Continuous variables were normalized using z-scores to ensure comparability, and categorical variables were encoded with information-preserving schemes, such as one-hot encoding for nominal variables. Outlier detection and validation combined statistical techniques (e.g., the interquartile range method) with clinical judgment to ensure clinical plausibility. Outcome assessment was blinded: evaluators determining renal function decline (defined as eGFR decline &#x2265;30% or progression to dialysis) were unaware of the model&#x2019;s predictions, to minimize bias. Predictors, including eGFR, age, UPCR, comorbidities, and serum creatinine, were pre-specified based on clinical guidelines (KDIGO 2012) and prior literature (<xref ref-type="bibr" rid="ref1">1</xref>, <xref ref-type="bibr" rid="ref13">13</xref>), ensuring alignment with established nephrology knowledge.
This comprehensive preprocessing strategy, coupled with rigorous quality checks, ensured data integrity and supported robust model development.</p>
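<p>As an illustration of the normalization, encoding, and outlier-screening steps described above, the following minimal Python sketch uses only the standard library. The function names, the 1.5&#x202F;&#x00D7;&#x202F;IQR fence, and the example serum creatinine values are assumptions made for illustration and are not taken from the study pipeline; MICE imputation is not shown.</p>
<preformat><![CDATA[
```python
from statistics import mean, pstdev, quantiles


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mu) / sigma for v in values]


def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (k = 1.5 is an assumed fence)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [not (low <= v <= high) for v in values]


def one_hot(labels):
    """One-hot encode a nominal variable; columns follow sorted category order."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


# Hypothetical serum creatinine values (mg/dL); the last one is implausibly high.
scr = [0.9, 1.1, 1.3, 1.2, 1.0, 9.5]
scr_z = z_scores(scr)
scr_flags = iqr_outlier_flags(scr)      # flagged values would go to clinical review
diabetes = one_hot(["yes", "no", "yes"])
```
]]></preformat>
<p>In this sketch, flagged values are only marked rather than removed, which mirrors the combination of a statistical rule with clinical judgment described in the text.</p>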
<p>Feature engineering drew on domain knowledge to improve predictive performance. Following (<xref ref-type="bibr" rid="ref24">24</xref>, <xref ref-type="bibr" rid="ref25">25</xref>), we constructed several temporal features to capture the dynamics of renal function: the time interval between consecutive serum creatinine tests (&#x0394;t), the change in serum creatinine (&#x0394;SCr), and the rate of change of eGFR over time (&#x0394;eGFR/&#x0394;t). We also created interaction features to assess how different factors act together, such as the interaction between age and eGFR (age &#x00D7; eGFR) and the interaction between UPCR and diabetes status (UPCR &#x00D7; diabetes). Introducing these features significantly improved the model&#x2019;s ability to predict renal function deterioration. We also noted that a relatively high proportion of imputed missing data may introduce bias. We calculated the false negative rate (FNR) to evaluate model performance in high-risk patients and optimized the model to reduce the likelihood of false negatives. This integrated approach improved the model&#x2019;s prediction accuracy and reliability and made it more practical for clinical use.</p>
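<p>The temporal and interaction features described above can be sketched as follows. This is a minimal illustration rather than the study code: the <monospace>Visit</monospace> record, the field names, the use of days as the time unit, and the example values are all assumptions.</p>
<preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class Visit:
    day: float   # days since the first recorded test (assumed time unit)
    scr: float   # serum creatinine, mg/dL
    egfr: float  # eGFR, mL/min/1.73 m^2


def temporal_features(visits):
    """Compute delta_t, delta_scr and the eGFR slope for consecutive tests."""
    ordered = sorted(visits, key=lambda v: v.day)
    rows = []
    for prev, curr in zip(ordered, ordered[1:]):
        dt = curr.day - prev.day
        if dt <= 0:
            continue  # skip same-day duplicates to avoid division by zero
        rows.append({
            "delta_t": dt,
            "delta_scr": curr.scr - prev.scr,
            "egfr_slope": (curr.egfr - prev.egfr) / dt,
        })
    return rows


def interaction_features(age, egfr, upcr, has_diabetes):
    """Build the age x eGFR and UPCR x diabetes interaction terms."""
    return {
        "age_x_egfr": age * egfr,
        "upcr_x_diabetes": upcr * (1 if has_diabetes else 0),
    }


# Hypothetical patient with three tests roughly three months apart.
history = [Visit(0, 1.4, 52.0), Visit(90, 1.6, 46.0), Visit(180, 1.7, 43.0)]
temporal = temporal_features(history)
interactions = interaction_features(age=60, egfr=43.0, upcr=0.8, has_diabetes=True)
```
]]></preformat>
<p>A negative <monospace>egfr_slope</monospace> indicates declining renal function over the interval; how such per-interval values are aggregated per patient is not specified in the text and is left open here.</p>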
<p>The quality control policy follows current machine learning standards (<xref ref-type="bibr" rid="ref26">26</xref>). It includes data quality checks, logging of every change made to the data and of all corresponding databases, and automated verification procedures. This approach supports the reproducibility of the research, provides a foundation for continued improvement of the model, and lays the groundwork for future machine learning analyses. To ensure consistency, all participating centers followed standardized data collection protocols developed in accordance with existing renal function assessment guidelines. In addition, all centers used the same institutional review board application, ensuring consistent adherence to ethical standards at all sites. As a result, the collected data are comprehensive and comparable across centers, which provides a solid foundation for the development and validation of the prediction model.</p>
</sec>
<sec id="sec5">
<label>2.3</label>
<title>Machine learning model construction</title>
<p>The workflow for building the machine learning model was carefully crafted to combine multiple prediction methods with a specific selection of features and parameters. Our method had three components: ensemble model structure, feature selection pipeline, and training process optimization, as shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>. This approach aligns with recent advancements in clinical ML frameworks, such as the user-friendly ML pipeline proposed by Orhan et al. (<xref ref-type="bibr" rid="ref27">27</xref>) for cardiac structure assessment, which emphasizes interpretability and clinical applicability. Similarly, our ensemble framework prioritizes interpretable decision-making to facilitate integration into clinical workflows for chronic kidney disease (CKD) management. With respect to feature selection, we followed the method set forth by Su et al. (<xref ref-type="bibr" rid="ref26">26</xref>), which used a hybrid model that incorporated both statistical significance and domain knowledge.</p>
<fig position="float" id="fig3">
<label>Figure 3</label>
<caption>
<p>Model architecture. <bold>(A)</bold> Ensemble model structure illustrating the integration of Random Forest, XGBoost, and LightGBM. <bold>(B)</bold> Feature selection pipeline combining filter and wrapper methods with clinical domain knowledge. <bold>(C)</bold> Training process optimization, including hyperparameter tuning and Bayesian optimization framework.</p>
</caption>
<graphic xlink:href="fmed-12-1598065-g003.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Diagram illustrating a machine learning model architecture and pipeline. Section A (Model Architecture) includes Random Forest, XGBoost, and LightGBM models combined into a weighted ensemble for risk prediction. Section B (Feature Selection Pipeline) starts with 120 initial features, reduced through statistical selection and clinical validation to 45 features. Section C (Training Process) details a data split (70% train, 15% validation, 15% test), cross-validation with stratified sampling, hyperparameter optimization using Bayesian methods, and model evaluation with AUC-ROC and PR-Curve.</alt-text>
</graphic>
</fig>
<p>The ensemble architecture used three base learners: Random Forest, XGBoost, and LightGBM. The primary objective function for model optimization is formulated in <xref ref-type="disp-formula" rid="EQ1">Equation (1)</xref>:</p><disp-formula id="EQ1">
<label>(1)</label>
<mml:math id="M1">
<mml:mi>L</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>N</mml:mi>
</mml:mfrac>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:munderover>
<mml:mo stretchy="true">[</mml:mo>
<mml:mi>&#x03B1;</mml:mi>
<mml:mo>&#x00B7;</mml:mo>
<mml:mi mathvariant="italic">BCE</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mover accent="true">
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="true">&#x0302;</mml:mo>
</mml:mover>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>&#x03B2;</mml:mi>
<mml:mo>&#x00B7;</mml:mo>
<mml:mi mathvariant="italic">FL</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mover accent="true">
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="true">&#x0302;</mml:mo>
</mml:mover>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo stretchy="true">]</mml:mo>
<mml:mo>+</mml:mo>
<mml:mi>&#x03BB;</mml:mi>
<mml:mo>&#x2225;</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:msub>
<mml:mo>&#x2225;</mml:mo>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:math>
</disp-formula><p>Within the formulation, BCE refers to the binary cross-entropy loss, <inline-formula>
<mml:math id="M2">
<mml:mi mathvariant="italic">FL</mml:mi>
</mml:math>
</inline-formula> denotes the focal loss term, and &#x2225;<italic>&#x03B8;</italic>&#x2225;<sub>2</sub> is the <italic>L<sub>2</sub></italic> regularization term. Following Bellocchi et al. (<xref ref-type="bibr" rid="ref28">28</xref>), cross-validation was used to optimize the hyperparameters <inline-formula>
<mml:math id="M3">
<mml:mi>&#x03B1;</mml:mi>
</mml:math>
</inline-formula>, <inline-formula>
<mml:math id="M4">
<mml:mi>&#x03B2;</mml:mi>
</mml:math>
</inline-formula> and <inline-formula>
<mml:math id="M5">
<mml:mi>&#x03BB;</mml:mi>
</mml:math>
</inline-formula>. Feature selection combined filter and wrapper methods, with the importance score computed as shown in <xref ref-type="disp-formula" rid="EQ2">Equation (2)</xref>:</p><disp-formula id="EQ2">
<label>(2)</label>
<mml:math id="M6">
<mml:msub>
<mml:mtext mathvariant="italic">IS</mml:mtext>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mi>&#x03B3;</mml:mi>
<mml:mo>&#x00B7;</mml:mo>
<mml:mi mathvariant="italic">MI</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>Y</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mo stretchy="true">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x03B3;</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo>&#x00B7;</mml:mo>
<mml:msub>
<mml:mtext mathvariant="italic">SHAP</mml:mtext>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:math>
</disp-formula><p>In the equation, MI (<inline-formula>
<mml:math id="M7">
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:math>
</inline-formula>,<inline-formula>
<mml:math id="M8">
<mml:mi>Y</mml:mi>
</mml:math>
</inline-formula>) is the mutual information of the feature <inline-formula>
<mml:math id="M9">
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:math>
</inline-formula> with regard to the target <inline-formula>
<mml:math id="M10">
<mml:mi>Y</mml:mi>
</mml:math>
</inline-formula>, while <inline-formula>
<mml:math id="M11">
<mml:msub>
<mml:mtext mathvariant="italic">SHAP</mml:mtext>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:math>
</inline-formula> is the SHAP value contribution of the <inline-formula>
<mml:math id="M12">
<mml:mi mathvariant="italic">jth</mml:mi>
</mml:math>
</inline-formula> feature. Following Zacharias et al. (<xref ref-type="bibr" rid="ref25">25</xref>), the feature selection process was refined iteratively, guided by clinical domain knowledge.</p>
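<p>To make Equations (1) and (2) concrete, the following minimal Python sketch evaluates both expressions term by term. It is an illustration under stated assumptions, not the authors&#x2019; implementation: the focal loss focusing parameter (<monospace>gamma_fl</monospace>, set to 2.0), the treatment of <italic>&#x03B8;</italic> as a plain parameter vector, and all numeric inputs are hypothetical, and MI and SHAP values are assumed to have been rescaled to a comparable range before being combined.</p>
<preformat><![CDATA[
```python
import math


def bce(y, p, eps=1e-12):
    """Binary cross-entropy for one label y in {0, 1} and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def focal(y, p, gamma_fl=2.0, eps=1e-12):
    """Focal loss for one sample; gamma_fl = 2.0 is an assumed focusing parameter."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma_fl) * math.log(p_t)


def combined_loss(ys, ps, theta, alpha, beta, lam):
    """Equation (1): mean of alpha*BCE + beta*FL, plus lam times the L2 norm of theta."""
    n = len(ys)
    data_term = sum(alpha * bce(y, p) + beta * focal(y, p)
                    for y, p in zip(ys, ps)) / n
    l2_norm = math.sqrt(sum(t * t for t in theta))  # norm as written, not squared
    return data_term + lam * l2_norm


def importance_score(mi, shap, gamma):
    """Equation (2): gamma * MI(X_j, Y) + (1 - gamma) * SHAP_j."""
    return gamma * mi + (1.0 - gamma) * shap


# Hypothetical inputs for illustration only.
loss = combined_loss(ys=[1, 0], ps=[0.9, 0.2], theta=[3.0, 4.0],
                     alpha=0.5, beta=0.5, lam=0.01)
score = importance_score(mi=0.4, shap=0.8, gamma=0.25)
```
]]></preformat>
<p>Note that the focusing parameter of the focal loss is distinct from the weight <italic>&#x03B3;</italic> in Equation (2); the sketch uses separate names to avoid confusing the two.</p>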
<p>The hyperparameters for each model were systematically optimized as shown in <xref ref-type="table" rid="tab1">Table 1</xref>.</p>
<table-wrap position="float" id="tab1">
<label>Table 1</label>
<caption>
<p>Hyperparameters of different machine learning models.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Model type</th>
<th align="left" valign="top">Parameters</th>
<th align="center" valign="top">Search range</th>
<th align="center" valign="top">Optimal value</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top" rowspan="3">Random Forest</td>
<td align="left" valign="top">n_estimators</td>
<td align="center" valign="top">[100, 500]</td>
<td align="center" valign="top">300</td>
</tr>
<tr>
<td align="left" valign="top">max_depth</td>
<td align="center" valign="top">[3, 10]</td>
<td align="center" valign="top">6</td>
</tr>
<tr>
<td align="left" valign="top">min_samples_split</td>
<td align="center" valign="top">[2, 10]</td>
<td align="center" valign="top">5</td>
</tr>
<tr>
<td align="left" valign="top" rowspan="3">XGBoost</td>
<td align="left" valign="top">learning_rate</td>
<td align="center" valign="top">[0.01, 0.1]</td>
<td align="center" valign="top">0.05</td>
</tr>
<tr>
<td align="left" valign="top">max_depth</td>
<td align="center" valign="top">[3, 8]</td>
<td align="center" valign="top">5</td>
</tr>
<tr>
<td align="left" valign="top">subsample</td>
<td align="center" valign="top">[0.6, 1.0]</td>
<td align="center" valign="top">0.8</td>
</tr>
<tr>
<td align="left" valign="top" rowspan="3">LightGBM</td>
<td align="left" valign="top">num_leaves</td>
<td align="center" valign="top">[20, 100]</td>
<td align="center" valign="top">50</td>
</tr>
<tr>
<td align="left" valign="top">feature_fraction</td>
<td align="center" valign="top">[0.6, 0.9]</td>
<td align="center" valign="top">0.7</td>
</tr>
<tr>
<td align="left" valign="top">bagging_fraction</td>
<td align="center" valign="top">[0.6, 0.9]</td>
<td align="center" valign="top">0.8</td>
</tr>
</tbody>
</table>
</table-wrap>
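<p>The search ranges and optimal values in Table 1 can be written as a small configuration structure, sketched below. The dictionary layout and the helper names are assumptions made for illustration; only the parameter names, ranges, and optimal values come from Table 1.</p>
<preformat><![CDATA[
```python
# Search ranges and optimal values transcribed from Table 1.
SEARCH_SPACE = {
    "random_forest": {
        "n_estimators": {"range": (100, 500), "optimal": 300},
        "max_depth": {"range": (3, 10), "optimal": 6},
        "min_samples_split": {"range": (2, 10), "optimal": 5},
    },
    "xgboost": {
        "learning_rate": {"range": (0.01, 0.1), "optimal": 0.05},
        "max_depth": {"range": (3, 8), "optimal": 5},
        "subsample": {"range": (0.6, 1.0), "optimal": 0.8},
    },
    "lightgbm": {
        "num_leaves": {"range": (20, 100), "optimal": 50},
        "feature_fraction": {"range": (0.6, 0.9), "optimal": 0.7},
        "bagging_fraction": {"range": (0.6, 0.9), "optimal": 0.8},
    },
}


def optimal_params(model):
    """Return the optimal hyperparameters of one model as a plain dict."""
    return {name: spec["optimal"] for name, spec in SEARCH_SPACE[model].items()}


def all_optima_in_range():
    """Check that every optimal value lies inside its search range."""
    return all(
        spec["range"][0] <= spec["optimal"] <= spec["range"][1]
        for params in SEARCH_SPACE.values()
        for spec in params.values()
    )
```
]]></preformat>
<p>Keeping the ranges and the selected values in one structure makes it straightforward to confirm that each reported optimum falls within the range that was searched.</p>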
<p>Each base model was trained following the training scheme of Ferguson et al. (<xref ref-type="bibr" rid="ref23">23</xref>) with stratified 5-fold cross-validation. The ensemble prediction was computed as a weighted average of the base-model predictions, as shown in <xref ref-type="disp-formula" rid="EQ3">Equation (3)</xref>:</p><disp-formula id="EQ3">
<label>(3)</label>
<mml:math id="M17">
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo stretchy="true">&#x0302;</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>K</mml:mi>
</mml:munderover>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x00B7;</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo stretchy="true">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
</mml:math>
</disp-formula><p>where <inline-formula>
<mml:math id="M18">
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo stretchy="true">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
<mml:mspace width="0.25em"/>
</mml:math>
</inline-formula>represents the prediction from the <inline-formula>
<mml:math id="M19">
<mml:mi>k</mml:mi>
</mml:math>
</inline-formula>th base model and <inline-formula>
<mml:math id="M20">
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:math>
</inline-formula> are the optimized model weights determined through validation performance. The hyperparameter optimization process utilized a Bayesian optimization framework with the expected improvement acquisition function as shown in <xref ref-type="disp-formula" rid="EQ4">Equation (4)</xref>:</p><disp-formula id="EQ4">
<label>(4)</label>
<mml:math id="M21">
<mml:mi mathvariant="italic">EI</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="double-struck">E</mml:mi>
<mml:mo stretchy="true">[</mml:mo>
<mml:mo>max</mml:mo>
<mml:mo stretchy="true">(</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:msup>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
</mml:msup>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo>,</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo stretchy="true">]</mml:mo>
</mml:math>
</disp-formula><p>where <inline-formula>
<mml:math id="M22">
<mml:mi>f</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:msup>
<mml:mi>x</mml:mi>
<mml:mo>+</mml:mo>
</mml:msup>
<mml:mo stretchy="true">)</mml:mo>
</mml:math>
</inline-formula> represents the best performance observed so far. This approach, validated by Miller et al. (<xref ref-type="bibr" rid="ref29">29</xref>), enabled an efficient search of the hyperparameter space while balancing exploration and exploitation.</p>
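<p>Equations (3) and (4) can be sketched in a few lines of Python. The sketch rests on two assumptions that the text does not state: the ensemble weights are normalized so that they sum to one, and the surrogate model used in the Bayesian optimization returns a Gaussian posterior with mean <italic>mu</italic> and standard deviation <italic>sigma</italic>, for which the expected improvement has a closed form. All function and argument names are hypothetical.</p>
<preformat><![CDATA[
```python
import math


def ensemble_predict(predictions, weights):
    """Equation (3): weighted average of the base-model predictions for one sample.

    Weights are divided by their sum, so they need not add up to one (assumption).
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    return sum(w * p for w, p in zip(weights, predictions)) / total


def expected_improvement(mu, sigma, best):
    """Equation (4) in closed form for maximization under a N(mu, sigma^2) posterior."""
    if sigma <= 0.0:
        # Without posterior uncertainty the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf
```
]]></preformat>
<p>The first term of the expected improvement rewards candidates whose predicted mean exceeds the current best (exploitation), and the second term rewards candidates with high posterior uncertainty (exploration).</p>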
</sec>
<sec id="sec6">
<label>2.4</label>
<title>Model validation method</title>
<p>The validation strategy was designed to assess both predictive performance and clinical relevance. Following the framework of Churpek et al. (<xref ref-type="bibr" rid="ref30">30</xref>), we combined internal and external validation.</p>
<p>For internal validation, we used 5-fold cross-validation. The performance metrics were computed using the following formulations (<xref ref-type="bibr" rid="ref31">31</xref>), beginning with the AUC in <xref ref-type="disp-formula" rid="EQ5">Equation (5)</xref>:</p><disp-formula id="EQ5">
<label>(5)</label>
<mml:math id="M23">
<mml:mtable equalrows="true" equalcolumns="true" displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">AUC</mml:mi>
<mml:mo>=</mml:mo>
<mml:msubsup>
<mml:mo>&#x222B;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mn>1</mml:mn>
</mml:msubsup>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">TP</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
<mml:mi>P</mml:mi>
</mml:mfrac>
<mml:mo stretchy="true">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">FP</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:mfrac>
<mml:mo stretchy="true">)</mml:mo>
<mml:mi mathvariant="italic">d&#x03B8;</mml:mi>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula><p>where<inline-formula>
<mml:math id="M24">
<mml:mspace width="0.25em"/>
<mml:mi mathvariant="italic">TP</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
</mml:math>
</inline-formula> and <inline-formula>
<mml:math id="M25">
<mml:mi mathvariant="italic">FP</mml:mi>
<mml:mo stretchy="true">(</mml:mo>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo stretchy="true">)</mml:mo>
</mml:math>
</inline-formula> represent the numbers of true positives and false positives at threshold <inline-formula>
<mml:math id="M26">
<mml:mi>&#x03B8;</mml:mi>
</mml:math>
</inline-formula>, respectively, and <italic>P</italic> and <italic>N</italic> are the total numbers of positive and negative cases. Calibration was assessed using the Brier score, as shown in <xref ref-type="disp-formula" rid="EQ6">Equation (6)</xref>:</p><disp-formula id="EQ6">
<label>(6)</label>
<mml:math id="M27">
<mml:mtable equalrows="true" equalcolumns="true" displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">BS</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>N</mml:mi>
</mml:mfrac>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="true">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>p</mml:mi>
<mml:mo stretchy="true">&#x0302;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="true">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula><p>where <inline-formula>
<mml:math id="M28">
<mml:mover accent="true">
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="true">&#x0302;</mml:mo>
</mml:mover>
</mml:math>
</inline-formula> represents the predicted probability for the <inline-formula>
<mml:math id="M29">
<mml:mi>i</mml:mi>
</mml:math>
</inline-formula>th instance. The model&#x2019;s classification performance was evaluated using multiple metrics, as shown in <xref ref-type="table" rid="tab2">Table 2</xref>.</p>
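<p>The Brier score in Equation (6) is the mean squared difference between the binary outcome and the predicted probability. A minimal sketch follows; the function name and the example values are illustrative and are not taken from the study.</p>
<preformat><![CDATA[
```python
def brier_score(y_true, p_pred):
    """Mean squared difference between binary outcomes and predicted probabilities."""
    if len(y_true) != len(p_pred):
        raise ValueError("y_true and p_pred must have the same length")
    if not y_true:
        raise ValueError("at least one instance is required")
    return sum((y - p) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


# Illustrative outcomes and predicted probabilities.
example = brier_score([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4])
```
]]></preformat>
<p>A score of 0 corresponds to perfect predictions, and a model that always predicts a probability of 0.5 scores 0.25 regardless of the outcomes.</p>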
<table-wrap position="float" id="tab2">
<label>Table 2</label>
<caption>
<p>Performance metrics of the ensemble model in internal validation cohort.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Metric</th>
<th align="left" valign="top">Formula</th>
<th align="center" valign="top">Value (95% CI)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Sensitivity</td>
<td align="left" valign="top">
<inline-formula>
<mml:math id="M30">
<mml:mfrac>
<mml:mi mathvariant="italic">TP</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">TP</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">FN</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula>
</td>
<td align="center" valign="top">86% (83&#x2013;89%)</td>
</tr>
<tr>
<td align="left" valign="top">Specificity</td>
<td align="left" valign="top">
<inline-formula>
<mml:math id="M31">
<mml:mfrac>
<mml:mi mathvariant="italic">TN</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">TN</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">FP</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula>
</td>
<td align="center" valign="top">82% (79&#x2013;85%)</td>
</tr>
<tr>
<td align="left" valign="top">PPV</td>
<td align="left" valign="top">
<inline-formula>
<mml:math id="M32">
<mml:mfrac>
<mml:mi mathvariant="italic">TP</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">TP</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">FP</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula>
</td>
<td align="center" valign="top">84% (81&#x2013;87%)</td>
</tr>
<tr>
<td align="left" valign="top">NPV</td>
<td align="left" valign="top">
<inline-formula>
<mml:math id="M33">
<mml:mfrac>
<mml:mi mathvariant="italic">TN</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">TN</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">FN</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:math>
</inline-formula>
</td>
<td align="center" valign="top">85% (82&#x2013;88%)</td>
</tr>
</tbody>
</table>
</table-wrap>
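<p>The four formulas in Table 2 all derive from the cells of a confusion matrix, as the following sketch shows. The function name and the counts in the example are hypothetical and do not correspond to the study data.</p>
<preformat><![CDATA[
```python
def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # An undefined ratio (empty denominator) is reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=86, fp=18, tn=82, fn=14)
```
]]></preformat>
<p>Sensitivity and specificity depend only on the classifier, whereas PPV and NPV also depend on the proportion of progression cases in the cohort, so the latter two may change when the model is applied to a population with a different prevalence.</p>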
<p>External validation was conducted following the protocol described by Makino et al. (<xref ref-type="bibr" rid="ref32">32</xref>), utilizing an independent cohort from three external medical centers. The concordance between predicted and observed risks was assessed using the calibration slope (<inline-formula>
<mml:math id="M34">
<mml:mi>&#x03B2;</mml:mi>
</mml:math>
</inline-formula>) as shown in <xref ref-type="disp-formula" rid="EQ7">Equation (7)</xref>:</p><disp-formula id="EQ7">
<label>(7)</label>
<mml:math id="M35">
<mml:mtext mathvariant="italic">logit</mml:mtext>
<mml:mo stretchy="true">(</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mtext mathvariant="italic">observed</mml:mtext>
</mml:msub>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>&#x03B1;</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>&#x03B2;</mml:mi>
<mml:mo>&#x00B7;</mml:mo>
<mml:mtext mathvariant="italic">logit</mml:mtext>
<mml:mo stretchy="true">(</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mtext mathvariant="italic">predicted</mml:mtext>
</mml:msub>
<mml:mo stretchy="true">)</mml:mo>
</mml:math>
</disp-formula><p>The model&#x2019;s performance was compared with existing prediction methods through net reclassification improvement (NRI) as shown in <xref ref-type="disp-formula" rid="EQ8">Equation (8)</xref>:</p><disp-formula id="EQ8">
<label>(8)</label>
<mml:math id="M36">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">NRI</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo stretchy="true">(</mml:mo>
<mml:mfrac>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">up</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="italic">events</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mtext mathvariant="italic">events</mml:mtext>
</mml:msub>
</mml:mfrac>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">up</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="italic">nonevents</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mtext mathvariant="italic">nonevents</mml:mtext>
</mml:msub>
</mml:mfrac>
<mml:mo stretchy="true">)</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>&#x2212;</mml:mo>
<mml:mo stretchy="true">(</mml:mo>
<mml:mfrac>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mtext mathvariant="italic">down</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="italic">events</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mtext mathvariant="italic">events</mml:mtext>
</mml:msub>
</mml:mfrac>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mtext mathvariant="italic">down</mml:mtext>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="italic">nonevents</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mtext mathvariant="italic">nonevents</mml:mtext>
</mml:msub>
</mml:mfrac>
<mml:mo stretchy="true">)</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula><p>where <inline-formula>
<mml:math id="M37">
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi mathvariant="italic">up</mml:mi>
</mml:msub>
</mml:math>
</inline-formula> and <inline-formula>
<mml:math id="M38">
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mtext mathvariant="italic">down</mml:mtext>
</mml:msub>
</mml:math>
</inline-formula> represent the numbers of individuals reclassified to a higher and a lower risk category, respectively, counted separately among individuals with and without the event. As demonstrated by Ekundayo et al. (<xref ref-type="bibr" rid="ref33">33</xref>), this approach quantifies the model&#x2019;s incremental value over existing methods.</p>
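<p>Equation (8) can be evaluated directly from six counts. The sketch below follows the equation term by term; the function name, argument names, and example counts are hypothetical.</p>
<preformat><![CDATA[
```python
def net_reclassification_improvement(
    up_events, down_events, n_events, up_nonevents, down_nonevents, n_nonevents
):
    """Equation (8): upward-reclassification term minus downward-reclassification term."""
    if n_events <= 0 or n_nonevents <= 0:
        raise ValueError("both groups must contain at least one individual")
    up_term = up_events / n_events - up_nonevents / n_nonevents
    down_term = down_events / n_events - down_nonevents / n_nonevents
    return up_term - down_term


# Hypothetical counts: 100 events and 200 non-events.
nri = net_reclassification_improvement(
    up_events=30, down_events=10, n_events=100,
    up_nonevents=20, down_nonevents=40, n_nonevents=200,
)
```
]]></preformat>
<p>In this example the upward term is 0.30&#x202F;&#x2212;&#x202F;0.10&#x202F;=&#x202F;0.20 and the downward term is 0.10&#x202F;&#x2212;&#x202F;0.20&#x202F;=&#x202F;&#x2212;0.10, giving an NRI of 0.30. Positive values indicate that the new model moves events upward and non-events downward more often than the reverse.</p>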
<p>The integrated discrimination improvement (IDI) was calculated as shown in <xref ref-type="disp-formula" rid="EQ9">Equation (9)</xref>:</p><disp-formula id="EQ9">
<label>(9)</label>
<mml:math id="M39">
<mml:mi mathvariant="italic">IDI</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo stretchy="true">(</mml:mo>
<mml:mover accent="true">
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">new</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="italic">events</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">&#x00AF;</mml:mo>
</mml:mover>
<mml:mo>&#x2212;</mml:mo>
<mml:mover accent="true">
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">new</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="italic">nonevents</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">&#x00AF;</mml:mo>
</mml:mover>
<mml:mo stretchy="true">)</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mo stretchy="true">(</mml:mo>
<mml:mover accent="true">
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">old</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="italic">events</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">&#x00AF;</mml:mo>
</mml:mover>
<mml:mo>&#x2212;</mml:mo>
<mml:mover accent="true">
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi mathvariant="italic">old</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext mathvariant="italic">nonevents</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="true">&#x00AF;</mml:mo>
</mml:mover>
<mml:mo stretchy="true">)</mml:mo>
</mml:math>
</disp-formula><p>where <inline-formula>
<mml:math id="M40">
<mml:mover accent="true">
<mml:mi>P</mml:mi>
<mml:mo stretchy="true">&#x00AF;</mml:mo>
</mml:mover>
</mml:math>
</inline-formula> represents the mean predicted probability in the indicated group. This metric, as validated by Delrue et al. (<xref ref-type="bibr" rid="ref34">34</xref>), quantifies the model&#x2019;s improved ability to separate events from non-events.</p>
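<p>Equation (9) compares the discrimination slope (the mean predicted probability in events minus that in non-events) of the new and old models. A minimal sketch follows, with hypothetical function names and illustrative probabilities.</p>
<preformat><![CDATA[
```python
from statistics import fmean


def discrimination_slope(p_events, p_nonevents):
    """Mean predicted probability among events minus that among non-events."""
    return fmean(p_events) - fmean(p_nonevents)


def integrated_discrimination_improvement(
    new_events, new_nonevents, old_events, old_nonevents
):
    """Equation (9): discrimination slope of the new model minus that of the old model."""
    return discrimination_slope(new_events, new_nonevents) - discrimination_slope(
        old_events, old_nonevents
    )


# Illustrative predicted probabilities for two events and two non-events.
idi = integrated_discrimination_improvement(
    new_events=[0.8, 0.7], new_nonevents=[0.2, 0.3],
    old_events=[0.6, 0.7], old_nonevents=[0.3, 0.4],
)
```
]]></preformat>
<p>Here the new model separates the groups by 0.50 and the old model by 0.30, so the IDI is 0.20.</p>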
<p>The comparative analysis results with existing methods are presented in <xref ref-type="table" rid="tab3">Table 3</xref>:</p>
<table-wrap position="float" id="tab3">
<label>Table 3</label>
<caption>
<p>Comparative analysis of different risk prediction models.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Method</th>
<th align="center" valign="top">AUC (95% CI)</th>
<th align="center" valign="top">Sensitivity</th>
<th align="center" valign="top">Specificity</th>
<th align="center" valign="top">NRI</th>
<th align="center" valign="top">IDI</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle">Our Model</td>
<td align="center" valign="middle">0.89 (0.87&#x2013;0.91)</td>
<td align="center" valign="middle">0.86</td>
<td align="center" valign="middle">0.82</td>
<td align="center" valign="middle">Reference</td>
<td align="center" valign="middle">Reference</td>
</tr>
<tr>
<td align="left" valign="middle">Traditional Cox</td>
<td align="center" valign="middle">0.82 (0.79&#x2013;0.85)</td>
<td align="center" valign="middle">0.78</td>
<td align="center" valign="middle">0.76</td>
<td align="center" valign="middle">0.15&#x002A;</td>
<td align="center" valign="middle">0.08&#x002A;</td>
</tr>
<tr>
<td align="left" valign="middle">Standard ML</td>
<td align="center" valign="middle">0.85 (0.83&#x2013;0.87)</td>
<td align="center" valign="middle">0.81</td>
<td align="center" valign="middle">0.79</td>
<td align="center" valign="middle">0.11&#x002A;</td>
<td align="center" valign="middle">0.06&#x002A;</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><sup>&#x002A;</sup><italic>p&#x202F;&#x003C;</italic> 0.001.</p>
</table-wrap-foot>
</table-wrap>
<p>These comprehensive validation results demonstrate the robust performance and generalizability of our proposed model across different clinical settings and patient populations.</p>
</sec>
</sec>
<sec sec-type="results" id="sec7">
<label>3</label>
<title>Results</title>
<sec id="sec8">
<label>3.1</label>
<title>Baseline characteristics of the study population</title>
<p>The demographic and clinical characteristics of the 1,200 participants in the study cohort are summarized in <xref ref-type="fig" rid="fig4">Figure 4</xref> and <xref ref-type="table" rid="tab4">Table 4</xref>. The baseline characteristics differed significantly between the progression and non-progression groups across multiple dimensions.</p>
<fig position="float" id="fig4">
<label>Figure 4</label>
<caption>
<p>Baseline characteristics stratified by disease progression status.</p>
</caption>
<graphic xlink:href="fmed-12-1598065-g004.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Bar charts displaying baseline characteristics by disease progression status. (a) Age Distribution: similar ages for progression and non-progression groups. (b) Comorbidity Prevalence: higher percentages in progression group for hypertension, diabetes, and cardiovascular diseases. (c) eGFR Distribution: comparable eGFR levels for both groups. (d) UPCR Distribution: higher levels in progression group.</alt-text>
</graphic>
</fig>
<table-wrap position="float" id="tab4">
<label>Table 4</label>
<caption>
<p>Baseline characteristics of study participants.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Category</th>
<th align="left" valign="top">Characteristic</th>
<th align="center" valign="top">Overall (<italic>N</italic>=1,200)</th>
<th align="center" valign="top">Progression (<italic>n</italic>=432)</th>
<th align="center" valign="top">Non-progression (<italic>n</italic>=768)</th>
<th align="center" valign="top"><italic>p</italic>-value</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle" rowspan="3">Demographic Characteristics</td>
<td align="left" valign="middle">Age, years&#x002A;</td>
<td align="center" valign="middle">62.5&#x202F;&#x00B1;&#x202F;13.4</td>
<td align="center" valign="middle">64.8&#x202F;&#x00B1;&#x202F;12.6</td>
<td align="center" valign="middle">61.2&#x202F;&#x00B1;&#x202F;13.8</td>
<td align="center" valign="middle">0.003</td>
</tr>
<tr>
<td align="left" valign="middle">Male sex, <inline-formula>
<mml:math id="M44">
<mml:mi>n</mml:mi>
</mml:math>
</inline-formula> (%)</td>
<td align="center" valign="middle">684 (57.0)</td>
<td align="center" valign="middle">259 (60.0)</td>
<td align="center" valign="middle">425 (55.3)</td>
<td align="center" valign="middle">0.124</td>
</tr>
<tr>
<td align="left" valign="middle">BMI, kg/m<sup>2</sup>&#x002A;</td>
<td align="center" valign="middle">25.8&#x202F;&#x00B1;&#x202F;4.2</td>
<td align="center" valign="middle">26.3&#x202F;&#x00B1;&#x202F;4.5</td>
<td align="center" valign="middle">25.5&#x202F;&#x00B1;&#x202F;4.0</td>
<td align="center" valign="middle">0.008</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="3">Comorbidities, <inline-formula>
<mml:math id="M45">
<mml:mi>n</mml:mi>
</mml:math>
</inline-formula> (%)</td>
<td align="left" valign="middle">Hypertension</td>
<td align="center" valign="middle">852 (71.0)</td>
<td align="center" valign="middle">334 (77.3)</td>
<td align="center" valign="middle">518 (67.4)</td>
<td align="center" valign="middle">&#x003C;0.001</td>
</tr>
<tr>
<td align="left" valign="middle">Diabetes</td>
<td align="center" valign="middle">456 (38.0)</td>
<td align="center" valign="middle">198 (45.8)</td>
<td align="center" valign="middle">258 (33.6)</td>
<td align="center" valign="middle">&#x003C;0.001</td>
</tr>
<tr>
<td align="left" valign="middle">CVD</td>
<td align="center" valign="middle">288 (24.0)</td>
<td align="center" valign="middle">129 (29.9)</td>
<td align="center" valign="middle">159 (20.7)</td>
<td align="center" valign="middle">0.001</td>
</tr>
<tr>
<td align="left" valign="middle" rowspan="5">Laboratory Parameters</td>
<td align="left" valign="middle">eGFR, mL/min/1.73m<sup>2</sup>&#x002A;</td>
<td align="center" valign="middle">45.8&#x202F;&#x00B1;&#x202F;15.6</td>
<td align="center" valign="middle">42.3&#x202F;&#x00B1;&#x202F;16.2</td>
<td align="center" valign="middle">47.8&#x202F;&#x00B1;&#x202F;14.9</td>
<td align="center" valign="middle">&#x003C;0.001</td>
</tr>
<tr>
<td align="left" valign="middle">Serum creatinine, mg/dL&#x002A;</td>
<td align="center" valign="middle">1.8&#x202F;&#x00B1;&#x202F;0.6</td>
<td align="center" valign="middle">2.0&#x202F;&#x00B1;&#x202F;0.7</td>
<td align="center" valign="middle">1.7&#x202F;&#x00B1;&#x202F;0.5</td>
<td align="center" valign="middle">&#x003C;0.001</td>
</tr>
<tr>
<td align="left" valign="middle">Albumin, g/dL&#x002A;</td>
<td align="center" valign="middle">3.9&#x202F;&#x00B1;&#x202F;0.5</td>
<td align="center" valign="middle">3.7&#x202F;&#x00B1;&#x202F;0.6</td>
<td align="center" valign="middle">4.0&#x202F;&#x00B1;&#x202F;0.4</td>
<td align="center" valign="middle">0.002</td>
</tr>
<tr>
<td align="left" valign="middle">UPCR, g/g&#x002A;</td>
<td align="center" valign="middle">1.8&#x202F;&#x00B1;&#x202F;2.1</td>
<td align="center" valign="middle">2.4&#x202F;&#x00B1;&#x202F;2.5</td>
<td align="center" valign="middle">1.5&#x202F;&#x00B1;&#x202F;1.8</td>
<td align="center" valign="middle">&#x003C;0.001</td>
</tr>
<tr>
<td align="left" valign="middle">Hemoglobin, g/dL&#x002A;</td>
<td align="center" valign="middle">11.8&#x202F;&#x00B1;&#x202F;1.9</td>
<td align="center" valign="middle">11.4&#x202F;&#x00B1;&#x202F;2.0</td>
<td align="center" valign="middle">12.0&#x202F;&#x00B1;&#x202F;1.8</td>
<td align="center" valign="middle">0.004</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The symbol &#x002A; indicates that the data represents mean &#x00B1; standard deviation (SD) for continuous variables.</p>
</table-wrap-foot>
</table-wrap>
<p>Participants in the progression group were significantly older than those in the non-progression group (64.8&#x202F;&#x00B1;&#x202F;12.6 vs. 61.2&#x202F;&#x00B1;&#x202F;13.8&#x202F;years; <inline-formula>
<mml:math id="M46">
<mml:mi>p</mml:mi>
</mml:math>
</inline-formula> = 0.003), as depicted in <xref ref-type="fig" rid="fig4">Figure 4</xref>. This finding suggests that older age may contribute to kidney function deterioration.</p>
<p>Among the comorbidities, hypertension was the most prevalent, affecting 77.3% of the progression group versus 67.4% of the non-progression group (<inline-formula>
<mml:math id="M47">
<mml:mi>p</mml:mi>
</mml:math>
</inline-formula> &#x003C; 0.001). Diabetes was also more common in the progression group than in the non-progression group (45.8% vs. 33.6%; <inline-formula>
<mml:math id="M48">
<mml:mi>p</mml:mi>
</mml:math>
</inline-formula> &#x003C; 0.001) and showed the largest absolute between-group difference. CVD followed the same pattern, with a prevalence of 29.9% in the progression group compared with 20.7% in the non-progression group (<inline-formula>
<mml:math id="M49">
<mml:mi>p</mml:mi>
</mml:math>
</inline-formula> = 0.001).</p>
<p>The laboratory parameters in <xref ref-type="fig" rid="fig4">Figure 4</xref> and <xref ref-type="table" rid="tab4">Table 4</xref> also differed between the groups. The mean estimated glomerular filtration rate (eGFR) was lower in the progression group (42.3&#x202F;&#x00B1;&#x202F;16.2&#x202F;mL/min/1.73m<sup>2</sup>) than in the non-progression group (47.8&#x202F;&#x00B1;&#x202F;14.9&#x202F;mL/min/1.73m<sup>2</sup>; <italic>p</italic> &#x003C; 0.001). This difference illustrates the importance of renal function indicators in predicting disease progression.</p>
<p>The urinary protein-to-creatinine ratio (UPCR) provided further insight into the decline in kidney function. As depicted in <xref ref-type="fig" rid="fig4">Figure 4</xref>, the progression group had a higher mean UPCR (2.4&#x202F;&#x00B1;&#x202F;2.5 vs. 1.5&#x202F;&#x00B1;&#x202F;1.8&#x202F;g/g), which corresponds to heavier proteinuria and possible renal injury. These biochemical differences may offer information about the mechanisms of kidney function decline.</p>
<p>Taken together, the baseline characteristics illustrate the multifactorial nature of kidney function decline. Statistically significant differences were observed across demographic, comorbidity, and laboratory parameters. This characterization of the cohort provides the clinical context for the predictors used in the machine learning model.</p>
<p><xref ref-type="fig" rid="fig4">Figure 4</xref> shows the distributions of age, comorbidity prevalence, estimated glomerular filtration rate (eGFR), and urinary protein-to-creatinine ratio (UPCR) in the progression and non-progression groups.</p>
</sec>
<sec id="sec9">
<label>3.2</label>
<title>Model performance evaluation</title>
<p>The ensemble model showed strong predictive performance across all evaluation metrics. In the receiver operating characteristic (ROC) analysis shown in <xref ref-type="fig" rid="fig5">Figure 5A</xref>, the ensemble model discriminated better than the individual base learners. It attained an area under the ROC curve (AUC) of 0.89 (95% CI: 0.87&#x2013;0.91), which was higher than that of the random forest (AUC: 0.85, 95% CI: 0.83&#x2013;0.87) and XGBoost (AUC: 0.87, 95% CI: 0.85&#x2013;0.89) models, and it also outperformed both models on the other performance measures.</p>
<fig position="float" id="fig5">
<label>Figure 5</label>
<caption>
<p>Comprehensive evaluation of model performance in predicting kidney function decline. <bold>(A)</bold> Receiver Operating Characteristic (ROC) curve analysis comparing the ensemble model to individual base learners. <bold>(B)</bold> Calibration plot demonstrating the alignment between predicted and observed risks. <bold>(C)</bold> Confusion matrix illustrating classification performance with true positives, true negatives, false positives, and false negatives.</p>
</caption>
<graphic xlink:href="fmed-12-1598065-g005.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Three-panel image showing ROC curves, calibration curves, and a confusion matrix. Panel (a) displays ROC curves for Ensemble, Random Forest, LightGBM, and XGBoost models, with the area under the curve (AUC) values of 0.89, 0.85, 0.86, and 0.87, respectively. Panel (b) displays calibration curves for an Ensemble model compared to perfect and typical models. Panel (c) shows a confusion matrix with true labels on the y-axis and predicted labels on the x-axis, featuring 97 true negatives, 3 false positives, 24 false negatives, and 27 true positives.</alt-text>
</graphic>
</fig>
<p>The performance metrics on the training, validation, and test sets are summarized in <xref ref-type="table" rid="tab5">Table 5</xref>.</p>
<table-wrap position="float" id="tab5">
<label>Table 5</label>
<caption>
<p>Comprehensive performance metrics across different datasets.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Performance metric</th>
<th align="center" valign="top">Training set (95% CI)</th>
<th align="center" valign="top">Validation set (95% CI)</th>
<th align="center" valign="top">Test set (95% CI)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle">AUC</td>
<td align="center" valign="middle">89% (87&#x2013;91%)</td>
<td align="center" valign="middle">87% (85&#x2013;89%)</td>
<td align="center" valign="middle">86% (84&#x2013;88%)</td>
</tr>
<tr>
<td align="left" valign="middle">Sensitivity</td>
<td align="center" valign="middle">88% (85&#x2013;91%)</td>
<td align="center" valign="middle">86% (83&#x2013;89%)</td>
<td align="center" valign="middle">85% (82&#x2013;88%)</td>
</tr>
<tr>
<td align="left" valign="middle">Specificity</td>
<td align="center" valign="middle">84% (81&#x2013;87%)</td>
<td align="center" valign="middle">83% (80&#x2013;86%)</td>
<td align="center" valign="middle">82% (79&#x2013;85%)</td>
</tr>
<tr>
<td align="left" valign="middle">PPV</td>
<td align="center" valign="middle">85% (82&#x2013;88%)</td>
<td align="center" valign="middle">84% (81&#x2013;87%)</td>
<td align="center" valign="middle">83% (80&#x2013;86%)</td>
</tr>
<tr>
<td align="left" valign="middle">NPV</td>
<td align="center" valign="middle">87% (84&#x2013;90%)</td>
<td align="center" valign="middle">85% (82&#x2013;88%)</td>
<td align="center" valign="middle">84% (81&#x2013;87%)</td>
</tr>
<tr>
<td align="left" valign="middle">F1 Score</td>
<td align="center" valign="middle">86% (84&#x2013;88%)</td>
<td align="center" valign="middle">85% (83&#x2013;87%)</td>
<td align="center" valign="middle">84% (82&#x2013;86%)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The machine learning ensemble model provides a tool for predicting renal function decline in chronic kidney disease (CKD), giving clinicians actionable risk estimates for personalized care. The calibration analysis (<xref ref-type="fig" rid="fig5">Figure 5b</xref>) shows that predicted risks closely track observed outcomes across the entire risk spectrum. With a calibration slope of 0.96 (95% CI: 0.94&#x2013;0.98) and an intercept of 0.02 (95% CI: 0.01&#x2013;0.03), the model exhibits minimal systematic bias, so its risk estimates can be used to guide treatment decisions. In practical terms, a predicted 30% risk of CKD progression corresponds closely to the observed likelihood, which supports patient counseling and intervention planning. The confusion matrix in <xref ref-type="fig" rid="fig5">Figure 5c</xref> summarizes the classification performance of the model in predicting the risk of renal function decline in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The confusion matrix shows that the model has similar accuracy in predicting true positives and true negatives, indicating balanced performance in distinguishing cases of renal function decline from non-decline cases.</p>
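<p>The relationship between the four confusion-matrix counts and the metrics reported in <xref ref-type="table" rid="tab5">Table 5</xref> can be illustrated with a minimal Python sketch. The counts used here are hypothetical, chosen only so that sensitivity and specificity equal the test-set values of 85% and 82%; they are not taken from the study, and the function name <monospace>confusion_metrics</monospace> is an assumption made for illustration.</p>
<preformat preformat-type="code">
```python
def confusion_metrics(tp, fp, fn, tn):
    """Return standard classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts (100 decline cases, 100 non-decline cases);
# these are illustrative values, not the counts behind Figure 5c.
metrics = confusion_metrics(tp=85, fp=18, fn=15, tn=82)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```
</preformat>
<p>Because PPV and NPV depend on the proportion of decline cases in the sample, the same sensitivity and specificity would give different predictive values in a cohort with a different event rate.</p>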
<p>These results reflect the reliability and predictive performance of the ensemble model for kidney function decline. Its calibration, discrimination, subgroup performance, and validation results together support the model&#x2019;s usefulness for early risk assessment and intervention in clinical practice.</p>
</sec>
<sec id="sec10">
<label>3.3</label>
<title>Analysis of the results</title>
<p>The analysis of kidney function decline risk revealed intricate interrelationships among several clinical variables and their predictive effects. Feature importance and feature interactions are illustrated in <xref ref-type="fig" rid="fig6">Figure 6</xref>.</p>
<fig position="float" id="fig6">
<label>Figure 6</label>
<caption>
<p>Feature importance and interaction analysis. <bold>(a)</bold> SHAP feature importance; <bold>(b)</bold> feature distribution; <bold>(c)</bold> eGFR SHAP dependence; <bold>(d)</bold> feature interaction.</p>
</caption>
<graphic xlink:href="fmed-12-1598065-g006.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Feature importance and interaction analysis with four panels: (a) A bar chart showing SHAP feature importance, highlighting eGFR, Age, and UPCR as significant. (b) Box plots displaying feature distribution for eGFR, Age, UPCR, and Creatinine. (c) A line graph illustrating eGFR SHAP dependence, showing an increasing trend. (d) A heatmap of feature interactions indicating strong relationships, with higher values in eGFR and Age.</alt-text>
</graphic>
</fig>
<p>The SHAP analysis revealed a clear hierarchy among the predictive factors. Estimated glomerular filtration rate (eGFR) was by far the most important predictor, as indicated by the magnitude of its SHAP value, followed by age and urinary protein creatinine ratio (UPCR). This ranking underscores that risk assessment for kidney function decline is multifactorial. <xref ref-type="table" rid="tab6">Table 6</xref> summarizes the key risk factors and their clinical significance.</p>
<table-wrap position="float" id="tab6">
<label>Table 6</label>
<caption>
<p>Key risk factors and clinical significance.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Risk factor</th>
<th align="left" valign="top">Importance ranking</th>
<th align="left" valign="top">Clinical significance</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle">eGFR</td>
<td align="left" valign="middle">Highest</td>
<td align="left" valign="middle">Primary indicator of kidney function</td>
</tr>
<tr>
<td align="left" valign="middle">Age</td>
<td align="left" valign="middle">Second</td>
<td align="left" valign="middle">Modulates disease progression risk</td>
</tr>
<tr>
<td align="left" valign="middle">UPCR</td>
<td align="left" valign="middle">Third</td>
<td align="left" valign="middle">Reflects kidney damage and proteinuria</td>
</tr>
<tr>
<td align="left" valign="middle">Comorbidities</td>
<td align="left" valign="middle">Fourth</td>
<td align="left" valign="middle">Indicates systemic health impact</td>
</tr>
<tr>
<td align="left" valign="middle">Creatinine</td>
<td align="left" valign="middle">Fifth</td>
<td align="left" valign="middle">Supplementary renal function marker</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The feature distribution boxplots in <xref ref-type="fig" rid="fig6">Figure 6b</xref> show how the clinical parameters differ in spread. The distributions of eGFR and age displayed greater variation, indicating how variable these parameters are in the evaluation of kidney function. The SHAP dependence plot for eGFR in <xref ref-type="fig" rid="fig6">Figure 6c</xref> demonstrated a non-linear relationship between eGFR and its contribution to the prediction, which points to the complex mechanisms underlying kidney function decline.</p>
<p>The feature interaction analysis in <xref ref-type="fig" rid="fig6">Figure 6d</xref> showed important dependencies among clinical markers. The interaction heatmap showed moderate to strong interactions, especially among eGFR, age, and UPCR. These relationships indicate that the decline in kidney function is not the result of a single factor but the product of many interacting physiological parameters.</p>
<p><xref ref-type="fig" rid="fig7">Figure 7</xref> presents an interpretation framework for clinical risk. The waterfall plot of risk factor contributions in <xref ref-type="fig" rid="fig7">Figure 7a</xref> showed that risk contributions were cumulative, with baseline characteristics and the central clinical features successively adjusting the predicted risk. The prediction probability distribution in <xref ref-type="fig" rid="fig7">Figure 7b</xref> separated patients into three groups (low, medium, and high risk), which is useful for personalized risk evaluation. <xref ref-type="fig" rid="fig7">Figure 7c</xref> shows a scatter plot of risk features against predicted risk, in which the correlation was strongly positive and the risk indicators are color-coded. Risk stratification by subgroup in <xref ref-type="fig" rid="fig7">Figure 7d</xref> demonstrated heterogeneity among patient populations, with notably higher risk probabilities for elderly patients and those with multiple comorbidities.</p>
<fig position="float" id="fig7">
<label>Figure 7</label>
<caption>
<p>Clinical risk interpretation and stratification. <bold>(a)</bold> Risk factors contribution; <bold>(b)</bold> prediction probability distribution; <bold>(c)</bold> features vs. prediction results; <bold>(d)</bold> risk stratification by subgroups.</p>
</caption>
<graphic xlink:href="fmed-12-1598065-g007.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Clinical risk interpretation and stratification visualization includes: (a) bar and line graphs showing risk factors contributions like eGFR and age; (b) box plots displaying prediction probability distribution across low, medium, and high risk categories; (c) scatter plot with regression line illustrating features versus prediction results related to eGFR; (d) error bar chart depicting risk stratification by subgroups such as young, middle-aged, elderly, and comorbid.</alt-text>
</graphic>
</fig>
<p>This analysis offers quantifiable metrics alongside biological explanations for a process that has remained largely qualitative. Along with providing a means of risk identification and intervention, it adds a new dimension to understanding the decline of kidney function due to advanced age. The integration of publicly available healthcare datasets with machine learning may enable clinicians to adopt changes in clinical practice more quickly than before.</p>
<p>This approach integrates computing and healthcare by offering precise measures to counteract the deterioration of kidney function, and it leaves room for further innovation that challenges existing practices in nephrology.</p>
</sec>
<sec id="sec11">
<label>3.4</label>
<title>Comparison with traditional methods</title>
<p>Compared with conventional approaches, our proposed machine learning model showed clear improvements in predictive performance and clinical usefulness. The error distribution in <xref ref-type="fig" rid="fig8">Figure 8</xref> is approximately normal and centered around 0. The traditional method had an AUC of 0.695 in the ROC curve analysis, and the calibration plot showed good agreement between predicted and observed probabilities across the entire risk spectrum.</p>
<fig position="float" id="fig8">
<label>Figure 8</label>
<caption>
<p>Comprehensive model performance analysis. <bold>(a)</bold> Error distribution histogram; <bold>(b)</bold> ROC curve analysis (AUC&#x202F;=&#x202F;0.695); <bold>(c)</bold> calibration plot with confidence intervals.</p>
</caption>
<graphic xlink:href="fmed-12-1598065-g008.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Panel (a) shows a bar chart of prediction errors with frequency on the y-axis and error ranging from negative twenty to twenty on the x-axis. Panel (b) features a Receiver Operating Characteristic (ROC) curve with an AUC of zero point six eight five, plotting true positive rate against false positive rate. Panel (c) displays a calibration plot with observed probability versus predicted probability, indicating alignment along the diagonal.</alt-text>
</graphic>
</fig>
<p>The advantages for clinical application are shown in <xref ref-type="fig" rid="fig9">Figure 9</xref>, where the multi-dimensional performance analysis compares our proposed ML model with both the Cox model and the standard ML model. The stratified accuracy analysis shows consistent performance across patient subgroups, and the time-to-progression analysis shows an enhanced ability to predict the timing of disease progression. The cost-effectiveness analysis also supports the practical benefits of our approach in clinical use.</p>
<fig position="float" id="fig9">
<label>Figure 9</label>
<caption>
<p>Advanced clinical application advantages. <bold>(a)</bold> Multi-dimensional performance analysis comparing proposed ML, Cox model, and standard ML; <bold>(b)</bold> stratified accuracy analysis across patient subgroups; <bold>(c)</bold> disease progression dynamics with time-to-progression analysis; <bold>(d)</bold> cost-effectiveness analysis across different implementation aspects.</p>
</caption>
<graphic xlink:href="fmed-12-1598065-g009.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Four graphs showing clinical application analyses. (a) Bar graph comparing performance scores for Prediction Time, Computational Complexity, Memory Usage, and Scalability among Proposed ML, Cox Model, and Standard ML. (b) Box plots illustrating accuracy across age groups: Young, Middle-aged, Elderly, and Comorbid. (c) Line graph showing Disease Progression Dynamics, with patient count and cumulative risk over time. (d) Bar graph depicting Cost-Effectiveness Analysis for Direct Costs, System Integration, Implementation Cost, and Risk Reduction among Proposed ML, Cox Model, and Standard ML.</alt-text>
</graphic>
</fig>
<p>The detailed performance metrics of the different methodological approaches are presented in <xref ref-type="table" rid="tab7">Table 7</xref>. <xref ref-type="table" rid="tab8">Table 8</xref> presents the training methods and hyperparameter settings of the Transformer and RNN baseline models.</p>
<table-wrap position="float" id="tab7">
<label>Table 7</label>
<caption>
<p>Comparative performance analysis of different prediction models for kidney function decline.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Evaluation metric</th>
<th align="center" valign="top">Proposed ML model</th>
<th align="center" valign="top">Traditional Cox model</th>
<th align="center" valign="top">Standard ML</th>
<th align="center" valign="top">Transformer</th>
<th align="center" valign="top">RNN</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle">AUC</td>
<td align="center" valign="middle">87.9% (85.6&#x2013;90.2%)</td>
<td align="center" valign="middle">69.5% (67.2&#x2013;71.8%)</td>
<td align="center" valign="middle">78.2% (75.9&#x2013;80.5%)</td>
<td align="center" valign="middle">87.0% (84.7&#x2013;89.3%)</td>
<td align="center" valign="middle">81.0% (78.7&#x2013;83.3%)</td>
</tr>
<tr>
<td align="left" valign="middle">Prediction Time (s)</td>
<td align="center" valign="middle">0.48&#x202F;&#x00B1;&#x202F;0.05</td>
<td align="center" valign="middle">1.86&#x202F;&#x00B1;&#x202F;0.12</td>
<td align="center" valign="middle">0.92&#x202F;&#x00B1;&#x202F;0.08</td>
<td align="center" valign="middle">3.35&#x202F;&#x00B1;&#x202F;0.10</td>
<td align="center" valign="middle">2.95&#x202F;&#x00B1;&#x202F;0.07</td>
</tr>
<tr>
<td align="left" valign="middle">Resource Utilization (%)</td>
<td align="center" valign="middle">28.5&#x202F;&#x00B1;&#x202F;3.2</td>
<td align="center" valign="middle">72.3&#x202F;&#x00B1;&#x202F;5.1</td>
<td align="center" valign="middle">45.7&#x202F;&#x00B1;&#x202F;4.3</td>
<td align="center" valign="middle">65.0&#x202F;&#x00B1;&#x202F;4.5</td>
<td align="center" valign="middle">59.0&#x202F;&#x00B1;&#x202F;4.0</td>
</tr>
<tr>
<td align="left" valign="middle">Implementation Cost&#x002A;</td>
<td align="center" valign="middle">0.65&#x202F;&#x00B1;&#x202F;0.07</td>
<td align="center" valign="middle">1.00&#x202F;&#x00B1;&#x202F;0.00</td>
<td align="center" valign="middle">0.82&#x202F;&#x00B1;&#x202F;0.05</td>
<td align="center" valign="middle">0.92&#x202F;&#x00B1;&#x202F;0.06</td>
<td align="center" valign="middle">0.87&#x202F;&#x00B1;&#x202F;0.05</td>
</tr>
<tr>
<td align="left" valign="middle">Scalability Index&#x2020;</td>
<td align="center" valign="middle">0.92&#x202F;&#x00B1;&#x202F;0.03</td>
<td align="center" valign="middle">0.45&#x202F;&#x00B1;&#x202F;0.05</td>
<td align="center" valign="middle">0.67&#x202F;&#x00B1;&#x202F;0.04</td>
<td align="center" valign="middle">0.83&#x202F;&#x00B1;&#x202F;0.04</td>
<td align="center" valign="middle">0.72&#x202F;&#x00B1;&#x202F;0.04</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>&#x002A;Normalized to traditional Cox model cost (1.00). &#x2020;Measured on a scale of 0&#x2013;1, where 1 represents optimal scalability.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap position="float" id="tab8">
<label>Table 8</label>
<caption>
<p>Model architectures and training details.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Model</th>
<th align="left" valign="top">Training method</th>
<th align="left" valign="top">Hyperparameter settings</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle">Transformer</td>
<td align="left" valign="middle">
<list list-type="bullet">
<list-item>
<p>Adam optimizer</p>
</list-item>
<list-item>
<p>Learning rate warm-up and decay strategy</p>
</list-item>
<list-item>
<p>Negative log-likelihood loss function</p>
</list-item>
</list>
</td>
<td align="left" valign="middle">
<list list-type="bullet">
<list-item>
<p>Hidden dimension: 256</p>
</list-item>
<list-item>
<p>Number of heads: 4</p>
</list-item>
<list-item>
<p>Dropout rate: 0.1</p>
</list-item>
<list-item>
<p>Learning rate: 1e-4</p>
</list-item>
<list-item>
<p>Batch size: 32</p>
</list-item>
<list-item>
<p>Epochs: 50</p>
</list-item>
</list>
</td>
</tr>
<tr>
<td align="left" valign="middle">RNN</td>
<td align="left" valign="middle">
<list list-type="bullet">
<list-item>
<p>SGD optimizer</p>
</list-item>
<list-item>
<p>Mean squared error loss function</p>
</list-item>
</list>
</td>
<td align="left" valign="middle">
<list list-type="bullet">
<list-item>
<p>Hidden dimension: 256</p>
</list-item>
<list-item>
<p>Dropout rate: 0.2</p>
</list-item>
<list-item>
<p>Learning rate: 0.01</p>
</list-item>
<list-item>
<p>Batch size: 64</p>
</list-item>
<list-item>
<p>Epochs: 50</p>
</list-item>
</list>
</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The normalized cost of resource usage is calculated as shown in <xref ref-type="disp-formula" rid="EQ10">Equation 10</xref>, where <inline-formula>
<mml:math id="M50">
<mml:mspace width="0.25em"/>
<mml:msub>
<mml:mtext>Resource Usage</mml:mtext>
<mml:mtext>proposed</mml:mtext>
</mml:msub>
<mml:mspace width="0.25em"/>
</mml:math>
</inline-formula> represents the cost of training and inference for the proposed machine learning model (an ensemble model based on Random Forest, XGBoost, and LightGBM) on the cloud service platform. <inline-formula>
<mml:math id="M51">
<mml:mtext>Resource Usage</mml:mtext>
</mml:math>
</inline-formula> represents the cost of training and inference for a traditional Cox proportional hazards regression model on the same cloud service platform under the same conditions.</p>
<p>The Scalability Index measures the performance stability of a model across different dataset sizes or clinical scenarios and is calculated as shown in <xref ref-type="disp-formula" rid="EQ11">Equation 11</xref>. Here, Performance Variance reflects the degree of fluctuation of the model&#x2019;s AUC across datasets of different scales, measuring its predictive stability in different datasets or scenarios, and <inline-formula>
<mml:math id="M52">
<mml:mo>max</mml:mo>
<mml:mspace width="0.25em"/>
<mml:mtext>Variance</mml:mtext>
</mml:math>
</inline-formula> represents the maximum value of performance variance. These indices provide a quantitative basis for comparing our model with the alternatives.</p><disp-formula id="EQ10">
<label>(10)</label>
<mml:math id="M53">
<mml:mtable equalrows="true" equalcolumns="true" displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mspace width="0.33em"/>
<mml:msub>
<mml:mtext>Cost</mml:mtext>
<mml:mi>norm</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mspace width="0.33em"/>
<mml:msub>
<mml:mtext>Resource Usage</mml:mtext>
<mml:mtext>proposed</mml:mtext>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:mspace width="0.33em"/>
<mml:mtext>Resource Usage</mml:mtext>
<mml:mspace width="0.33em"/>
</mml:mrow>
</mml:mfrac>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula><disp-formula id="EQ11">
<label>(11)</label>
<mml:math id="M54">
<mml:mtable equalrows="true" equalcolumns="true" displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mtext>Scalability Index</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mspace width="0.33em"/>
<mml:mtext>Performance Variance</mml:mtext>
<mml:mspace width="0.33em"/>
</mml:mrow>
<mml:mrow>
<mml:mspace width="0.33em"/>
<mml:mo>max</mml:mo>
<mml:mspace width="0.33em"/>
<mml:mtext>Variance</mml:mtext>
<mml:mspace width="0.33em"/>
</mml:mrow>
</mml:mfrac>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula><p>The AUC of our proposed ensemble model reached 0.879 (95% CI: 0.856&#x2013;0.902), exceeding the traditional Cox model&#x2019;s AUC of 0.695 (95% CI: 0.672&#x2013;0.718), standard ML&#x2019;s AUC of 0.782 (95% CI: 0.759&#x2013;0.805), the Transformer model&#x2019;s AUC of 0.870 (95% CI: 0.847&#x2013;0.893), and the RNN model&#x2019;s AUC of 0.810 (95% CI: 0.787&#x2013;0.833). In terms of computation time, our model achieved a prediction time of 0.48&#x202F;&#x00B1;&#x202F;0.05&#x202F;s, a 74.2% reduction relative to the Cox model&#x2019;s 1.86&#x202F;&#x00B1;&#x202F;0.12&#x202F;s, and was notably faster than the Transformer (3.35&#x202F;&#x00B1;&#x202F;0.10&#x202F;s) and RNN (2.95&#x202F;&#x00B1;&#x202F;0.07&#x202F;s) models, which were slower than even the standard ML model (0.92&#x202F;&#x00B1;&#x202F;0.08&#x202F;s). Resource utilization was 60.6% lower than that of the Cox model (28.5&#x202F;&#x00B1;&#x202F;3.2% vs. 72.3&#x202F;&#x00B1;&#x202F;5.1%) and also lower than that of the standard ML (45.7&#x202F;&#x00B1;&#x202F;4.3%), Transformer (65.0&#x202F;&#x00B1;&#x202F;4.5%), and RNN (59.0&#x202F;&#x00B1;&#x202F;4.0%) models. The calibration slope of 0.96 (95% CI: 0.94&#x2013;0.98) indicated minimal discrepancy between predicted and observed risks, supporting the model&#x2019;s reliability in risk stratification for kidney function decline.</p>
<p>Cost-effectiveness analysis revealed a 35% reduction in implementation cost for our proposed ensemble model (0.65&#x202F;&#x00B1;&#x202F;0.07) compared to the traditional Cox model (1.00&#x202F;&#x00B1;&#x202F;0.00), outperforming the standard ML model (0.82&#x202F;&#x00B1;&#x202F;0.05), Transformer (0.92&#x202F;&#x00B1;&#x202F;0.06), and RNN (0.87&#x202F;&#x00B1;&#x202F;0.05) models, while maintaining superior predictive accuracy (AUC: 0.879). The scalability index of 0.92&#x202F;&#x00B1;&#x202F;0.03 demonstrated robust performance across varying dataset sizes, significantly surpassing the Cox model (0.45&#x202F;&#x00B1;&#x202F;0.05), standard ML (0.67&#x202F;&#x00B1;&#x202F;0.04), Transformer (0.83&#x202F;&#x00B1;&#x202F;0.04), and RNN (0.72&#x202F;&#x00B1;&#x202F;0.04) models. These results, supported by rigorous internal and external validation (<xref ref-type="table" rid="tab2">Tables 2</xref>, <xref ref-type="table" rid="tab3">3</xref>, <xref ref-type="table" rid="tab7">7</xref>), highlight the model&#x2019;s efficiency and generalizability, positioning it as a highly viable tool for clinical integration across diverse settings to predict kidney function decline risk effectively.</p>
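<p>The percentage differences quoted for prediction time, resource utilization, and implementation cost follow from the mean values in <xref ref-type="table" rid="tab7">Table 7</xref>, and Equations 10 and 11 are simple ratios. The following Python sketch reproduces that arithmetic under stated assumptions: the function names are hypothetical, and the variance inputs to <monospace>scalability_index</monospace> are placeholders, because the underlying variances are not reported.</p>
<preformat preformat-type="code">
```python
def relative_reduction(reference, proposed):
    """Percent reduction of `proposed` relative to `reference`."""
    return 100.0 * (reference - proposed) / reference


def normalized_cost(usage_proposed, usage_reference):
    """Equation 10: resource usage of the proposed model over the reference."""
    return usage_proposed / usage_reference


def scalability_index(performance_variance, max_variance):
    """Equation 11: one minus performance variance over maximum variance."""
    return 1.0 - performance_variance / max_variance


# Mean values from Table 7 (traditional Cox model vs. proposed model).
time_reduction = relative_reduction(1.86, 0.48)      # prediction time (s)
resource_reduction = relative_reduction(72.3, 28.5)  # resource utilization (%)
cost_reduction = relative_reduction(1.00, 0.65)      # normalized cost

print(round(time_reduction, 1))      # 74.2
print(round(resource_reduction, 1))  # 60.6
print(round(cost_reduction, 1))      # 35.0

# Placeholder inputs only: a variance ratio of 0.08 gives an index of 0.92.
example_index = scalability_index(performance_variance=0.08, max_variance=1.0)
```
</preformat>
<p>The sketch uses only the reported means; it does not propagate the standard deviations, so it says nothing about the uncertainty of the percentage differences.</p>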
</sec>
<sec id="sec12">
<label>3.5</label>
<title>Clinical case analysis</title>
<p>To enhance the clinical utility of the model, we provided interpretability through SHAP analysis and further demonstrated its application in clinical decision-making through case snippets and integration strategies with electronic health records (EHR). The following case illustrates how the model prediction can guide personalized management of patients with chronic kidney disease (CKD).</p>
</sec>
</sec>
<sec sec-type="discussion" id="sec13">
<label>4</label>
<title>Discussion</title>
<p>Our analysis has important implications for clinical practice and offers insight into the capability of machine learning techniques to predict the decline of kidney function. The performance of our ensemble model, which achieved an AUC of 0.89 (95% CI: 0.87&#x2013;0.91), supports the integration of multiple machine learning algorithms for complex clinical predictions (<xref ref-type="bibr" rid="ref35">35</xref>). The AUC of the ensemble model substantially exceeds that of conventional statistical approaches and other methods for predicting chronic kidney disease progression, indicating improved risk stratification. Previous studies have reported that combining electronic health records with machine learning algorithms improves risk stratification (<xref ref-type="bibr" rid="ref36">36</xref>). Our findings further support this approach through validation across several clinical settings (<xref ref-type="table" rid="tab9">Table 9</xref>).</p>
<table-wrap position="float" id="tab9">
<label>Table 9</label>
<caption>
<p>Clinical case analysis of two example patients.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Patient information</th>
<th align="left" valign="top">Risk</th>
<th align="left" valign="top">Management suggestions</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle">A 65-year-old male patient with an eGFR of 45&#x202F;mL/min/1.73&#x202F;m<sup>2</sup> and a urinary protein creatinine ratio (UPCR) of 2.8&#x202F;g/g, accompanied by hypertension and diabetes</td>
<td align="left" valign="middle">High</td>
<td align="left" valign="middle">(1) Adjust antihypertensive drugs and prioritize the use of ACE inhibitors to reduce proteinuria; (2) Strengthen blood glucose control and optimize insulin treatment plan; (3) Arrange follow-up visits every 3&#x202F;months to monitor changes in eGFR and UPCR.</td>
</tr>
<tr>
<td align="left" valign="top">A 45-year-old female patient with an eGFR of 55&#x202F;mL/min/1.73&#x202F;m<sup>2</sup> and a UPCR of 0.5&#x202F;g/g, without significant comorbidities</td>
<td align="left" valign="middle">Low</td>
<td align="left" valign="top">Continue routine monitoring with follow-up every 6&#x202F;months</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Applying an ensemble framework that combines several algorithms is one of the changes we made to the use of machine learning in clinical prediction. This approach addresses several of the challenges faced in healthcare predictive modeling (<xref ref-type="bibr" rid="ref37">37</xref>, <xref ref-type="bibr" rid="ref38">38</xref>). The model&#x2019;s calibration (slope: 0.96, 95% CI: 0.94&#x2013;0.98) addresses one of the remaining obstacles to implementing machine learning in healthcare (<xref ref-type="bibr" rid="ref39">39</xref>). The reliability of artificial intelligence in predicting the worsening of kidney diseases is known to be high (<xref ref-type="bibr" rid="ref40">40</xref>). Our results provide further evidence supporting the adoption of such models in clinical work.</p>
<p>Although this research makes progress, some areas require further attention. First, the adoption of deep learning techniques and the threat of data leakage (<xref ref-type="bibr" rid="ref41">41</xref>, <xref ref-type="bibr" rid="ref42">42</xref>) both warrant further exploration. Second, more effort should be directed at addressing potential overfitting of the models, as noted in immunology (<xref ref-type="bibr" rid="ref43">43</xref>), while broadening the scope of the model to include new biomarkers and genetic influences. For instance, &#x00C7;i&#x00E7;ek et al. (<xref ref-type="bibr" rid="ref44">44</xref>) demonstrated that preoperative neopterin levels can predict acute kidney injury in on-pump cardiac surgery, highlighting the critical role of biomarker-driven risk stratification in kidney outcomes. This supports our proposal to incorporate novel biomarkers, such as neopterin or other inflammatory markers, to enhance the predictive capabilities of our model for CKD progression. The development of artificial intelligence in medicine (<xref ref-type="bibr" rid="ref45">45</xref>) presents new possibilities for including other features, such as genomic and proteomic markers, that would further improve the model. For future work, this study suggests automated methods for model updating, uniform data collection across clinics, and clear multi-center validation procedures. Adding real-time clinical decision support systems and extending the model to emerging biomarkers are the next crucial steps in this area.</p>
<p>Machine learning offers considerable promise for transforming the prediction of kidney function decline, but many obstacles still need to be overcome before our research can be implemented in a clinical setting. Our research connects machine learning with kidney pathology by laying the groundwork for personalized medicine and data-driven healthcare decision-making in nephrology. As the healthcare system continues to change, our model may become more useful for enhancing the quality of care and the efficient use of resources in chronic kidney disease treatment and prognosis.</p>
<sec id="sec14">
<label>4.1</label>
<title>Model stability analysis results</title>
<p><xref ref-type="table" rid="tab10">Table 10</xref> shows the stability of the ensemble model in predicting the risk of renal function decline. The AUC of the model is 0.87&#x202F;&#x00B1;&#x202F;0.02, with a coefficient of variation (CV) of only 2.3%, indicating that its predictive performance is highly consistent across multiple runs. Sensitivity (0.86&#x202F;&#x00B1;&#x202F;0.03, CV&#x202F;=&#x202F;3.5%) and specificity (0.84&#x202F;&#x00B1;&#x202F;0.03, CV&#x202F;=&#x202F;3.6%) also showed low variability, demonstrating the robustness of the model on different datasets. The calibration slope (0.96, 95% CI: 0.94&#x2013;0.98, CV&#x202F;=&#x202F;2.1%) and intercept (0.02, 95% CI: 0.01&#x2013;0.03, CV&#x202F;=&#x202F;1.8%) further confirmed the close agreement between the model predictions and actual results. These results indicate that the model can maintain reliable predictive performance in different operational and clinical scenarios and is suitable for a wide range of clinical applications.</p>
<table-wrap position="float" id="tab10">
<label>Table 10</label>
<caption>
<p>Model stability analysis results.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Stability metric</th>
<th align="center" valign="top">Value (95% CI)</th>
<th align="center" valign="top">Coefficient of variation (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle">AUC Stability</td>
<td align="center" valign="middle">0.87&#x202F;&#x00B1;&#x202F;0.02</td>
<td align="center" valign="middle">2.3</td>
</tr>
<tr>
<td align="left" valign="middle">Sensitivity Stability</td>
<td align="center" valign="middle">0.86&#x202F;&#x00B1;&#x202F;0.03</td>
<td align="center" valign="middle">3.5</td>
</tr>
<tr>
<td align="left" valign="middle">Specificity Stability</td>
<td align="center" valign="middle">0.84&#x202F;&#x00B1;&#x202F;0.03</td>
<td align="center" valign="middle">3.6</td>
</tr>
<tr>
<td align="left" valign="middle">Calibration Slope</td>
<td align="center" valign="middle">0.96 (0.94&#x2013;0.98)</td>
<td align="center" valign="middle">2.1</td>
</tr>
<tr>
<td align="left" valign="middle">Calibration Intercept</td>
<td align="center" valign="middle">0.02 (0.01&#x2013;0.03)</td>
<td align="center" valign="middle">1.8</td>
</tr>
</tbody>
</table>
</table-wrap>
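<p>The coefficients of variation in <xref ref-type="table" rid="tab10">Table 10</xref> can be checked against the reported summary values. The Python sketch below is illustrative only: it assumes that each value following the &#x00B1; sign is a standard deviation across repeated runs and that CV is defined as 100&#x202F;&#x00D7;&#x202F;SD/mean. The function names and the per-run AUC values are hypothetical and are not study data.</p>
<preformat>
```python
import statistics


def cv_percent(mean, sd):
    """Coefficient of variation as a percentage: 100 * SD / mean."""
    return 100.0 * sd / mean


def cv_from_runs(values):
    """CV (%) computed from per-run metric values using the sample SD."""
    return cv_percent(statistics.fmean(values), statistics.stdev(values))


# Summary values reported in Table 10 (assumed here to be mean and SD).
table10 = {
    "AUC": (0.87, 0.02),
    "Sensitivity": (0.86, 0.03),
    "Specificity": (0.84, 0.03),
}
table10_cv = {name: round(cv_percent(m, s), 1) for name, (m, s) in table10.items()}

# Hypothetical per-run AUCs (illustrative only, not study data).
example_runs = [0.85, 0.87, 0.89, 0.86, 0.88]
example_cv = cv_from_runs(example_runs)

if __name__ == "__main__":
    print(table10_cv)
    print(f"example CV = {example_cv:.1f}%")
```
</preformat>
<p>Under these assumptions, the recomputed values for AUC, sensitivity, and specificity agree with the CV column of the table after rounding to one decimal place.</p>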
</sec>
<sec id="sec15">
<label>4.2</label>
<title>Comprehensive evaluation of model performance</title>
<p>The decision curve analysis (DCA, <xref ref-type="fig" rid="fig5">Figure 5d</xref>) highlights the model&#x2019;s practical utility in clinical settings. It shows a substantial net benefit over default strategies of treating all or no patients, particularly in the 20&#x2013;60% risk range, where clinical decisions are most critical. For example, in this range, the model helps clinicians identify patients who would benefit most from intensified monitoring or early interventions, such as medication adjustments, while sparing low-risk patients unnecessary treatments. This targeted approach optimizes resource use and enhances patient outcomes by focusing efforts where they are most needed.</p>
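<p>For readers unfamiliar with decision curves, net benefit at a threshold probability p<sub>t</sub> is conventionally computed as TP/n&#x202F;&#x2212;&#x202F;(FP/n)&#x202F;&#x00D7;&#x202F;p<sub>t</sub>/(1&#x202F;&#x2212;&#x202F;p<sub>t</sub>), and the treat-none strategy has a net benefit of zero. The Python sketch below illustrates this standard formula. It is not the analysis code used in this study, and the outcome labels, predicted risks, and function names are hypothetical.</p>
<preformat>
```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is at or above threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds at the threshold probability
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the default strategy of treating every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes and predicted risks (illustrative only, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.7, 0.4, 0.3, 0.1, 0.2, 0.6, 0.05, 0.8, 0.15]

for pt in (0.2, 0.4, 0.6):
    nb_model = net_benefit(y_true, y_prob, pt)
    nb_all = net_benefit_treat_all(y_true, pt)
    print(f"threshold={pt:.1f}  model={nb_model:.3f}  treat-all={nb_all:.3f}  treat-none=0.000")
```
</preformat>
<p>A model is clinically useful at a given threshold when its net benefit exceeds that of both default strategies, which is the comparison a decision curve displays across the range of thresholds.</p>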
<p>Subgroup analyses (<xref ref-type="fig" rid="fig5">Figure 5e</xref>) further underscore the model&#x2019;s versatility across diverse patient populations, with outstanding performance in high-risk groups. For elderly patients, the model achieves an AUC of 0.88 (95% CI: 0.85&#x2013;0.91), and for those with diabetes, it reaches an AUC of 0.90 (95% CI: 0.87&#x2013;0.93). These groups are particularly vulnerable to rapid CKD progression, and the model&#x2019;s high accuracy in predicting their risk enables earlier and more tailored interventions, such as stricter blood pressure control or diabetes management, to slow disease progression. By providing clear, interpretable risk stratification, the model empowers clinicians to make data-driven decisions that improve patient care and quality of life.</p>
</sec>
<sec id="sec16">
<label>4.3</label>
<title>Sensitivity analysis of model performance across renal function decline definitions</title>
<p><xref ref-type="table" rid="tab11">Table 11</xref> presents the sensitivity analysis of the model&#x2019;s performance across alternative definitions of renal function decline. For the primary definition (eGFR decline &#x2265;30% or dialysis), the model achieved an AUC of 0.89 (95% CI: 0.87&#x2013;0.91), with a sensitivity of 0.86 and a specificity of 0.82. The alternative definitions (eGFR decline &#x2265;20%, eGFR decline &#x2265;40%, serum creatinine doubling, and progression to dialysis) yielded slightly lower but still high AUCs (0.86&#x2013;0.88), with sensitivity ranging from 0.83 to 0.85 and specificity from 0.80 to 0.83. Calibration slopes remained close to 1 (0.93&#x2013;0.96), indicating consistent agreement between predicted and observed risks. These results indicate that the model performs stably across clinical definitions of decline, supporting its reliability for risk stratification in chronic kidney disease management (<xref ref-type="fig" rid="fig10">Figure 10</xref>).</p>
<table-wrap position="float" id="tab11">
<label>Table 11</label>
<caption>
<p>Model performance under different definitions of renal function decline.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Definition</th>
<th align="center" valign="top">AUC (95% CI)</th>
<th align="center" valign="top">Sensitivity (95% CI)</th>
<th align="center" valign="top">Specificity (95% CI)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="middle">eGFR Decline &#x2265;30% or Dialysis</td>
<td align="center" valign="middle">0.89 (0.87&#x2013;0.91)</td>
<td align="center" valign="middle">0.86 (0.83&#x2013;0.89)</td>
<td align="center" valign="middle">0.82 (0.79&#x2013;0.85)</td>
</tr>
<tr>
<td align="left" valign="middle">eGFR Decline &#x2265;20%</td>
<td align="center" valign="middle">0.88 (0.86&#x2013;0.90)</td>
<td align="center" valign="middle">0.85 (0.82&#x2013;0.88)</td>
<td align="center" valign="middle">0.83 (0.80&#x2013;0.86)</td>
</tr>
<tr>
<td align="left" valign="middle">eGFR Decline &#x2265;40%</td>
<td align="center" valign="middle">0.87 (0.85&#x2013;0.89)</td>
<td align="center" valign="middle">0.84 (0.81&#x2013;0.87)</td>
<td align="center" valign="middle">0.81 (0.78&#x2013;0.84)</td>
</tr>
<tr>
<td align="left" valign="middle">Serum Creatinine Doubling</td>
<td align="center" valign="middle">0.86 (0.84&#x2013;0.88)</td>
<td align="center" valign="middle">0.83 (0.80&#x2013;0.86)</td>
<td align="center" valign="middle">0.80 (0.77&#x2013;0.83)</td>
</tr>
<tr>
<td align="left" valign="middle">Progression to Dialysis</td>
<td align="center" valign="middle">0.87 (0.85&#x2013;0.89)</td>
<td align="center" valign="middle">0.85 (0.82&#x2013;0.88)</td>
<td align="center" valign="middle">0.82 (0.79&#x2013;0.85)</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig position="float" id="fig10">
<label>Figure 10</label>
<caption>
<p>Decision curve, subgroup performance and stability analysis.</p>
</caption>
<graphic xlink:href="fmed-12-1598065-g010.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Three-panel image showing: (d) Decision Curve Analysis with curves for Ensemble Model, Random Forest, Treat All, and Treat None, plotting Net Benefit vs. Threshold Probability. (e) Subgroup Analysis with Hazard Ratios and 95% CI for different groups. (f) Stability Analysis showing AUC variation across Iterations, highlighting the mean AUC.</alt-text>
</graphic>
</fig>
</sec>
</sec>
<sec sec-type="conclusions" id="sec17">
<label>5</label>
<title>Conclusion</title>
<p>In this study, we developed and validated an ensemble machine learning model for predicting the risk of kidney function decline, which outperformed conventional methods. The ensemble model achieved high discrimination (AUC: 0.89, 95% CI: 0.87&#x2013;0.91), and its calibration remained good across diverse populations. Combining several techniques in a single ensemble framework, together with advanced feature selection, provides a solid basis for clinical risk prediction in nephrology.</p>
<p>The model also clarifies the relative importance of predictive factors, assigning the greatest weight to eGFR, age, and urinary protein-to-creatinine ratio, which helps in interpreting the drivers of kidney function deterioration. This interpretability, together with the model&#x2019;s predictive performance, strengthens clinicians&#x2019; ability to undertake early risk stratification and to tailor interventions. The consistent performance across patient subgroups and validation cohorts supports the model&#x2019;s potential value for widespread clinical use.</p>
<p>These results are relevant both to clinical management and to future research in nephrology. If adopted into clinical workflows, this predictive tool could substantially improve chronic kidney disease management by enabling timely, targeted interventions and better resource allocation. As data-driven decision-making gains traction in healthcare systems, the model offers a robust and practical means of predicting the risk of kidney function decline, with the potential to improve patient care through earlier and more effective intervention. Future work should focus on multicenter validation and on whether novel biomarkers can further improve the model&#x2019;s predictive performance and clinical applicability.</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="sec18">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/<xref rid="SM1" ref-type="supplementary-material">Supplementary material</xref>; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec sec-type="ethics-statement" id="sec19">
<title>Ethics statement</title>
<p>This study was conducted in accordance with the guidelines of the Declaration of Helsinki and was approved by the Ethics Committee of the 95th Hospital of Putian in China RongTong Medical Health Corporation. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation in this study was provided by the participants&#x2019; legal guardians/next of kin. The participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.</p>
</sec>
<sec sec-type="author-contributions" id="sec20">
<title>Author contributions</title>
<p>HC: Writing &#x2013; original draft, Writing &#x2013; review &#x0026; editing. YH: Writing &#x2013; original draft. LC: Writing &#x2013; original draft.</p>
</sec>
<sec sec-type="funding-information" id="sec21">
<title>Funding</title>
<p>The author(s) declare that no financial support was received for the research and/or publication of this article.</p>
</sec>
<sec sec-type="COI-statement" id="sec22">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="ai-statement" id="sec23">
<title>Generative AI statement</title>
<p>The author(s) declare that no Gen AI was used in the creation of this manuscript.</p>
<p>Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.</p>
</sec>
<sec sec-type="disclaimer" id="sec24">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec sec-type="supplementary-material" id="sec260">
<title>Supplementary material</title>
<p>The Supplementary material for this article can be found online at: <ext-link xlink:href="https://www.frontiersin.org/articles/10.3389/fmed.2025.1598065/full#supplementary-material" ext-link-type="uri">https://www.frontiersin.org/articles/10.3389/fmed.2025.1598065/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Table_1.docx" id="SM1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="ref1"><label>1.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bikbov</surname><given-names>B</given-names></name> <name><surname>Purcell</surname><given-names>CA</given-names></name> <name><surname>Levey</surname><given-names>AS</given-names></name> <name><surname>Smith</surname><given-names>M</given-names></name></person-group>. <article-title>Global, regional, and national burden of chronic kidney disease, 1990&#x2013;2017: a systematic analysis for the global burden of disease study 2017</article-title>. <source>Lancet</source>. (<year>2020</year>) <volume>395</volume>:<fpage>709</fpage>&#x2013;<lpage>33</lpage>. doi: <pub-id pub-id-type="doi">10.1016/S0140-6736(20)30045-3</pub-id>, PMID: <pub-id pub-id-type="pmid">32061315</pub-id></citation></ref>
<ref id="ref2"><label>2.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rajpurkar</surname><given-names>P</given-names></name> <name><surname>Chen</surname><given-names>E</given-names></name> <name><surname>Banerjee</surname><given-names>O</given-names></name></person-group>. <article-title>AI in health and medicine</article-title>. <source>Nat Med</source>. (<year>2022</year>) <volume>28</volume>:<fpage>31</fpage>&#x2013;<lpage>8</lpage>.</citation></ref>
<ref id="ref3"><label>3.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saleh</surname><given-names>A</given-names></name> <name><surname>Shaban</surname><given-names>NG</given-names></name> <name><surname>Al-Asklany</surname><given-names>HM</given-names></name></person-group>. <article-title>Value of serum kidney injury Molecule-1 in early prediction of kidney injury in patient with ascites and spontaneous bacterial peritonitis</article-title>. <source>Egypt J Hosp Med</source>. (<year>2023</year>) <volume>92</volume>:<fpage>5487</fpage>&#x2013;<lpage>95</lpage>. doi: <pub-id pub-id-type="doi">10.21608/ejhm.2023.305789</pub-id></citation></ref>
<ref id="ref4"><label>4.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname><given-names>C</given-names></name> <name><surname>Liu</surname><given-names>J</given-names></name> <name><surname>Fu</surname><given-names>P</given-names></name> <name><surname>Zou</surname><given-names>J</given-names></name></person-group>. <article-title>Artificial intelligence models in diagnosis and treatment of kidney diseases: current status and prospects</article-title>. <source>Kidney Dis</source>. (<year>2025</year>) <volume>11</volume>:<fpage>491</fpage>&#x2013;<lpage>507</lpage>. doi: <pub-id pub-id-type="doi">10.1159/000546397</pub-id>, PMID: <pub-id pub-id-type="pmid">40672962</pub-id></citation></ref>
<ref id="ref5"><label>5.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gulamali</surname><given-names>FF</given-names></name> <name><surname>Sawant</surname><given-names>AS</given-names></name> <name><surname>Nadkarni</surname><given-names>GN</given-names></name></person-group>. <article-title>Machine learning for risk stratification in kidney disease</article-title>. <source>Curr Opin Nephrol Hypertens</source>. (<year>2022</year>) <volume>31</volume>:<fpage>548</fpage>&#x2013;<lpage>52</lpage>. doi: <pub-id pub-id-type="doi">10.1097/MNH.0000000000000832</pub-id>, PMID: <pub-id pub-id-type="pmid">36004937</pub-id></citation></ref>
<ref id="ref6"><label>6.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kanda</surname><given-names>E</given-names></name> <name><surname>Epureanu</surname><given-names>BI</given-names></name> <name><surname>Adachi</surname><given-names>T</given-names></name> <name><surname>Kashihara</surname><given-names>N</given-names></name></person-group>. <article-title>Machine-learning-based web system for the prediction of chronic kidney disease progression and mortality</article-title>. <source>PLoS Digit Health</source>. (<year>2023</year>) <volume>2</volume>:<fpage>e0000188</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pdig.0000188</pub-id>, PMID: <pub-id pub-id-type="pmid">36812636</pub-id></citation></ref>
<ref id="ref7"><label>7.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Islam</surname><given-names>MA</given-names></name> <name><surname>Majumder</surname><given-names>MZH</given-names></name> <name><surname>Hussein</surname><given-names>MA</given-names></name></person-group>. <article-title>Chronic kidney disease prediction based on machine learning algorithms</article-title>. <source>J Pathol Inform</source>. (<year>2023</year>) <volume>14</volume>:<fpage>100189</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jpi.2023.100189</pub-id>, PMID: <pub-id pub-id-type="pmid">36714452</pub-id></citation></ref>
<ref id="ref8"><label>8.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Beissel</surname><given-names>PA</given-names></name></person-group>. <source>Equity Valuation Fresenius Medical Care AG &#x0026; Co. KGaA</source>. <publisher-loc>Portugal</publisher-loc>: <publisher-name>Universidade Catolica Portuguesa</publisher-name> (<year>2023</year>).</citation></ref>
<ref id="ref9"><label>9.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vagliano</surname><given-names>I</given-names></name> <name><surname>Chesnaye</surname><given-names>NC</given-names></name> <name><surname>Leopold</surname><given-names>JH</given-names></name> <name><surname>Jager</surname><given-names>KJ</given-names></name> <name><surname>Abu-Hanna</surname><given-names>A</given-names></name> <name><surname>Schut</surname><given-names>MC</given-names></name></person-group>. <article-title>Machine learning models for predicting acute kidney injury: a systematic review and critical appraisal</article-title>. <source>Clin Kidney J</source>. (<year>2022</year>) <volume>15</volume>:<fpage>2266</fpage>&#x2013;<lpage>80</lpage>. doi: <pub-id pub-id-type="doi">10.1093/ckj/sfac181</pub-id>, PMID: <pub-id pub-id-type="pmid">36381375</pub-id></citation></ref>
<ref id="ref10"><label>10.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sanmarchi</surname><given-names>F</given-names></name> <name><surname>Fanconi</surname><given-names>C</given-names></name> <name><surname>Golinelli</surname><given-names>D</given-names></name> <name><surname>Gori</surname><given-names>D</given-names></name> <name><surname>Hernandez-Boussard</surname><given-names>T</given-names></name> <name><surname>Capodici</surname><given-names>A</given-names></name></person-group>. <article-title>Predict, diagnose, and treat chronic kidney disease with machine learning: a systematic literature review</article-title>. <source>J Nephrol</source>. (<year>2023</year>) <volume>36</volume>:<fpage>1101</fpage>&#x2013;<lpage>17</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s40620-023-01573-4</pub-id>, PMID: <pub-id pub-id-type="pmid">36786976</pub-id></citation></ref>
<ref id="ref11"><label>11.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chicco</surname><given-names>D</given-names></name> <name><surname>Warrens</surname><given-names>MJ</given-names></name> <name><surname>Jurman</surname><given-names>G</given-names></name></person-group>. <article-title>The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation</article-title>. <source>PeerJ Comput Sci</source>. (<year>2021</year>) <volume>7</volume>:<fpage>e623</fpage>. doi: <pub-id pub-id-type="doi">10.7717/peerj-cs.623</pub-id>, PMID: <pub-id pub-id-type="pmid">34307865</pub-id></citation></ref>
<ref id="ref12"><label>12.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nelson</surname><given-names>RG</given-names></name> <name><surname>Grams</surname><given-names>ME</given-names></name> <name><surname>Ballew</surname><given-names>SH</given-names></name> <name><surname>Sang</surname><given-names>Y</given-names></name> <name><surname>Azizi</surname><given-names>F</given-names></name> <name><surname>Chadban</surname><given-names>SJ</given-names></name> <etal/></person-group>. <article-title>Development of risk prediction equations for incident chronic kidney disease</article-title>. <source>JAMA</source>. (<year>2019</year>) <volume>322</volume>:<fpage>2104</fpage>&#x2013;<lpage>14</lpage>. doi: <pub-id pub-id-type="doi">10.1001/jama.2019.17379</pub-id>, PMID: <pub-id pub-id-type="pmid">31703124</pub-id></citation></ref>
<ref id="ref13"><label>13.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tangri</surname><given-names>N</given-names></name> <name><surname>Stevens</surname><given-names>LA</given-names></name> <name><surname>Griffith</surname><given-names>J</given-names></name> <name><surname>Tighiouart</surname><given-names>H</given-names></name> <name><surname>Djurdjev</surname><given-names>O</given-names></name> <name><surname>Naimark</surname><given-names>D</given-names></name> <etal/></person-group>. <article-title>A predictive model for progression of chronic kidney disease to kidney failure</article-title>. <source>JAMA</source>. (<year>2011</year>) <volume>305</volume>:<fpage>1553</fpage>&#x2013;<lpage>9</lpage>. doi: <pub-id pub-id-type="doi">10.1001/jama.2011.451</pub-id>, PMID: <pub-id pub-id-type="pmid">21482743</pub-id></citation></ref>
<ref id="ref14"><label>14.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Prasad</surname><given-names>ML</given-names></name> <name><surname>Kiran</surname><given-names>A</given-names></name> <name><surname>Shaker Reddy</surname><given-names>PC</given-names></name></person-group>. <article-title>Chronic kidney disease risk prediction using machine learning techniques</article-title>. <source>J Inf Technol Manag</source>. (<year>2024</year>) <volume>16</volume>:<fpage>118</fpage>&#x2013;<lpage>34</lpage>. doi: <pub-id pub-id-type="doi">10.22059/jitm.2024.96378</pub-id></citation></ref>
<ref id="ref15"><label>15.</label><citation citation-type="other"><person-group person-group-type="author"><name><surname>Islam</surname><given-names>MA</given-names></name> <name><surname>Akter</surname><given-names>S</given-names></name> <name><surname>Hossen</surname><given-names>MS</given-names></name></person-group>. &#x201C;Risk factor prediction of chronic kidney disease based on machine learning algorithms&#x201D;; <italic>Proceedings of the 2020 3rd international conference on intelligent sustainable systems (ICISS)</italic>, (<year>2020</year>). IEEE.</citation></ref>
<ref id="ref16"><label>16.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kalupukuru</surname><given-names>SR</given-names></name> <name><surname>Natarajan</surname><given-names>K</given-names></name></person-group>. <article-title>Machine learning methods for predicting the prognosis of chronic kidney disease</article-title>. <source>Procedia Comput Sci</source>. (<year>2025</year>) <volume>258</volume>:<fpage>1372</fpage>&#x2013;<lpage>82</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.procs.2025.04.370</pub-id></citation></ref>
<ref id="ref17"><label>17.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ghosh</surname><given-names>SK</given-names></name> <name><surname>Khandoker</surname><given-names>AH</given-names></name></person-group>. <article-title>Investigation on explainable machine learning models to predict chronic kidney diseases</article-title>. <source>Sci Rep</source>. (<year>2024</year>) <volume>14</volume>:<fpage>3687</fpage>. doi: <pub-id pub-id-type="doi">10.1038/s41598-024-54375-4</pub-id>, PMID: <pub-id pub-id-type="pmid">38355876</pub-id></citation></ref>
<ref id="ref18"><label>18.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bai</surname><given-names>Q</given-names></name> <name><surname>Su</surname><given-names>C</given-names></name> <name><surname>Tang</surname><given-names>W</given-names></name> <name><surname>Li</surname><given-names>Y</given-names></name></person-group>. <article-title>Machine learning to predict end stage kidney disease in chronic kidney disease</article-title>. <source>Sci Rep</source>. (<year>2022</year>) <volume>12</volume>:<fpage>8377</fpage>. doi: <pub-id pub-id-type="doi">10.1038/s41598-022-12316-z</pub-id>, PMID: <pub-id pub-id-type="pmid">35589908</pub-id></citation></ref>
<ref id="ref19"><label>19.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lei</surname><given-names>N</given-names></name> <name><surname>Zhang</surname><given-names>X</given-names></name> <name><surname>Wei</surname><given-names>M</given-names></name> <name><surname>Lao</surname><given-names>B</given-names></name> <name><surname>Xu</surname><given-names>X</given-names></name> <name><surname>Zhang</surname><given-names>M</given-names></name> <etal/></person-group>. <article-title>Machine learning algorithms&#x2019; accuracy in predicting kidney disease progression: a systematic review and meta-analysis</article-title>. <source>BMC Med Inform Decis Mak</source>. (<year>2022</year>) <volume>22</volume>:<fpage>205</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s12911-022-01951-1</pub-id>, PMID: <pub-id pub-id-type="pmid">35915457</pub-id></citation></ref>
<ref id="ref20"><label>20.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Figueroa</surname><given-names>RL</given-names></name> <name><surname>Zeng-Treitler</surname><given-names>Q</given-names></name> <name><surname>Kandula</surname><given-names>S</given-names></name></person-group>. <article-title>Predicting sample size required for classification performance</article-title>. <source>BMC Med Inform Decis Mak</source>. (<year>2012</year>) <volume>12</volume>:<fpage>8</fpage>.</citation></ref>
<ref id="ref21"><label>21.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chan</surname><given-names>L</given-names></name> <name><surname>Nadkarni</surname><given-names>GN</given-names></name> <name><surname>Fleming</surname><given-names>F</given-names></name> <name><surname>McCullough</surname><given-names>JR</given-names></name> <name><surname>Connolly</surname><given-names>P</given-names></name> <name><surname>Mosoyan</surname><given-names>G</given-names></name> <etal/></person-group>. <article-title>Derivation and validation of a machine learning risk score using biomarker and electronic patient data to predict progression of diabetic kidney disease</article-title>. <source>Diabetologia</source>. (<year>2021</year>) <volume>64</volume>:<fpage>1504</fpage>&#x2013;<lpage>15</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s00125-021-05444-0</pub-id>, PMID: <pub-id pub-id-type="pmid">33797560</pub-id></citation></ref>
<ref id="ref22"><label>22.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xiao</surname><given-names>J</given-names></name> <name><surname>Ding</surname><given-names>R</given-names></name> <name><surname>Xu</surname><given-names>X</given-names></name></person-group>. <article-title>Comparison and development of machine learning tools in the prediction of chronic kidney disease progression</article-title>. <source>J Transl Med</source>. (<year>2019</year>) <volume>17</volume>:<fpage>119</fpage>.</citation></ref>
<ref id="ref23"><label>23.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ferguson</surname><given-names>T</given-names></name> <name><surname>Ravani</surname><given-names>P</given-names></name> <name><surname>Sood</surname><given-names>MM</given-names></name> <name><surname>Clarke</surname><given-names>A</given-names></name> <name><surname>Komenda</surname><given-names>P</given-names></name> <name><surname>Rigatto</surname><given-names>C</given-names></name> <etal/></person-group>. <article-title>Development and external validation of a machine learning model for progression of CKD</article-title>. <source>Kidney Int Rep</source>. (<year>2022</year>) <volume>7</volume>:<fpage>1772</fpage>&#x2013;<lpage>81</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ekir.2022.05.004</pub-id>, PMID: <pub-id pub-id-type="pmid">35967110</pub-id></citation></ref>
<ref id="ref24"><label>24.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Iftikhar</surname><given-names>H</given-names></name> <name><surname>Khan</surname><given-names>M</given-names></name> <name><surname>Khan</surname><given-names>Z</given-names></name> <name><surname>Khan</surname><given-names>F</given-names></name> <name><surname>Alshanbari</surname><given-names>HM</given-names></name> <name><surname>Ahmad</surname><given-names>Z</given-names></name></person-group>. <article-title>A comparative analysis of machine learning models: a case study in predicting chronic kidney disease</article-title>. <source>Sustainability</source>. (<year>2023</year>) <volume>15</volume>:<fpage>2754</fpage>. doi: <pub-id pub-id-type="doi">10.3390/su15032754</pub-id></citation></ref>
<ref id="ref25"><label>25.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zacharias</surname><given-names>HU</given-names></name> <name><surname>Altenbuchinger</surname><given-names>M</given-names></name> <name><surname>Schultheiss</surname><given-names>UT</given-names></name></person-group>. <article-title>A predictive model for progression of CKD to kidney failure based on routine laboratory tests</article-title>. <source>Am J Kidney Dis</source>. (<year>2022</year>) <volume>79</volume>:<fpage>217</fpage>&#x2013;<lpage>30</lpage>.</citation></ref>
<ref id="ref26"><label>26.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Su</surname><given-names>C-T</given-names></name> <name><surname>Chang</surname><given-names>Y-P</given-names></name> <name><surname>Ku</surname><given-names>Y-T</given-names></name> <name><surname>Lin</surname><given-names>C-M</given-names></name></person-group>. <article-title>Machine learning models for the prediction of renal failure in chronic kidney disease: a retrospective cohort study</article-title>. <source>Diagnostics</source>. (<year>2022</year>) <volume>12</volume>:<fpage>2454</fpage>. doi: <pub-id pub-id-type="doi">10.3390/diagnostics12102454</pub-id>, PMID: <pub-id pub-id-type="pmid">36292142</pub-id></citation></ref>
<ref id="ref27"><label>27.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Orhan</surname><given-names>A</given-names></name> <name><surname>Akbayrak</surname><given-names>H</given-names></name> <name><surname>&#x00C7;i&#x00E7;ek</surname><given-names>&#x00D6;F</given-names></name> <name><surname>Harmankaya</surname><given-names>&#x0130;</given-names></name> <name><surname>Vatansev</surname><given-names>H</given-names></name></person-group>. <article-title>A user-friendly machine learning approach for cardiac structures assessment</article-title>. <source>Front Cardiovasc Med</source>. (<year>2024</year>) <volume>11</volume>:<fpage>1426888</fpage>. doi: <pub-id pub-id-type="doi">10.3389/fcvm.2024.1426888</pub-id>, PMID: <pub-id pub-id-type="pmid">39036503</pub-id></citation></ref>
<ref id="ref28"><label>28.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bellocchio</surname><given-names>F</given-names></name> <name><surname>Lonati</surname><given-names>C</given-names></name> <name><surname>Ion</surname><given-names>J</given-names></name></person-group>. <article-title>Validation of a novel predictive algorithm for kidney failure in patients suffering from chronic kidney disease: The Prognostic Reasoning System for Chronic Kidney Disease (PROGRES-CKD)</article-title>. <source>Int J Environ Res Public Health</source>. (<year>2021</year>) <volume>18</volume>:<fpage>12649</fpage>. doi: <pub-id pub-id-type="doi">10.3390/ijerph182312649</pub-id>, PMID: <pub-id pub-id-type="pmid">34886378</pub-id></citation></ref>
<ref id="ref29"><label>29.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vastrad</surname><given-names>B</given-names></name> <name><surname>Vastrad</surname><given-names>C</given-names></name></person-group>. <article-title>Identification of candidate biomarkers and signaling pathways associated with Alzheimer&#x2019;s disease using bioinformatics analysis of next generation sequencing data</article-title>. <source>bioRxiv</source>. (<year>2024</year>):<fpage>626535</fpage>.</citation></ref>
<ref id="ref30"><label>30.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Churpek</surname><given-names>MM</given-names></name> <name><surname>Carey</surname><given-names>KA</given-names></name> <name><surname>Edelson</surname><given-names>DP</given-names></name> <name><surname>Singh</surname><given-names>T</given-names></name> <name><surname>Astor</surname><given-names>BC</given-names></name> <name><surname>Gilbert</surname><given-names>ER</given-names></name> <etal/></person-group>. <article-title>Internal and external validation of a machine learning risk score for acute kidney injury</article-title>. <source>JAMA Netw Open</source>. (<year>2020</year>) <volume>3</volume>:<fpage>e2012892</fpage>. doi: <pub-id pub-id-type="doi">10.1001/jamanetworkopen.2020.12892</pub-id>, PMID: <pub-id pub-id-type="pmid">32780123</pub-id></citation></ref>
<ref id="ref31"><label>31.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Segal</surname><given-names>Z</given-names></name> <name><surname>Kalifa</surname><given-names>D</given-names></name> <name><surname>Radinsky</surname><given-names>K</given-names></name> <name><surname>Ehrenberg</surname><given-names>B</given-names></name> <name><surname>Elad</surname><given-names>G</given-names></name> <name><surname>Maor</surname><given-names>G</given-names></name> <etal/></person-group>. <article-title>Machine learning algorithm for early detection of end-stage renal disease</article-title>. <source>BMC Nephrol</source>. (<year>2020</year>) <volume>21</volume>:<fpage>518</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s12882-020-02093-0</pub-id>, PMID: <pub-id pub-id-type="pmid">33246427</pub-id></citation></ref>
<ref id="ref32"><label>32.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Makino</surname><given-names>M</given-names></name> <name><surname>Yoshimoto</surname><given-names>R</given-names></name> <name><surname>Ono</surname><given-names>M</given-names></name> <name><surname>Itoko</surname><given-names>T</given-names></name> <name><surname>Katsuki</surname><given-names>T</given-names></name> <name><surname>Koseki</surname><given-names>A</given-names></name> <etal/></person-group>. <article-title>Artificial intelligence predicts the progression of diabetic kidney disease using big data machine learning</article-title>. <source>Sci Rep</source>. (<year>2019</year>) <volume>9</volume>:<fpage>11862</fpage>. doi: <pub-id pub-id-type="doi">10.1038/s41598-019-48263-5</pub-id>, PMID: <pub-id pub-id-type="pmid">31413285</pub-id></citation></ref>
<ref id="ref33"><label>33.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ekundayo</surname><given-names>F</given-names></name></person-group>. <article-title>Machine learning for chronic kidney disease progression modelling: leveraging data science to optimize patient management</article-title>. <source>World J Adv Res Rev</source>. (<year>2024</year>) <volume>24</volume>:<fpage>453</fpage>&#x2013;<lpage>75</lpage>.</citation></ref>
<ref id="ref34"><label>34.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Delrue</surname><given-names>C</given-names></name> <name><surname>De Bruyne</surname><given-names>S</given-names></name> <name><surname>Speeckaert</surname><given-names>MM</given-names></name></person-group>. <article-title>Application of machine learning in chronic kidney disease: current status and future prospects</article-title>. <source>Biomedicines</source>. (<year>2024</year>) <volume>12</volume>:<fpage>568</fpage>. doi: <pub-id pub-id-type="doi">10.3390/biomedicines12030568</pub-id>, PMID: <pub-id pub-id-type="pmid">38540181</pub-id></citation></ref>
<ref id="ref35"><label>35.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gogoi</surname><given-names>P</given-names></name> <name><surname>Valan</surname><given-names>JA</given-names></name></person-group>. <article-title>Machine learning approaches for predicting and diagnosing chronic kidney disease: current trends, challenges, solutions, and future directions</article-title>. <source>Int Urol Nephrol</source>. (<year>2025</year>) <volume>57</volume>:<fpage>1245</fpage>&#x2013;<lpage>68</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s11255-024-04281-5</pub-id>, PMID: <pub-id pub-id-type="pmid">39560857</pub-id></citation></ref>
<ref id="ref36"><label>36.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname><given-names>YC</given-names></name> <name><surname>Cha</surname><given-names>J</given-names></name> <name><surname>Shim</surname><given-names>I</given-names></name> <name><surname>Park</surname><given-names>WY</given-names></name> <name><surname>Kang</surname><given-names>SW</given-names></name> <name><surname>Lim</surname><given-names>DH</given-names></name> <etal/></person-group>. <article-title>Multimodal deep learning of fundus abnormalities and traditional risk factors for cardiovascular risk prediction</article-title>. <source>NPJ Digit Med</source>. (<year>2023</year>) <volume>6</volume>:<fpage>14</fpage>. doi: <pub-id pub-id-type="doi">10.1038/s41746-023-00748-4</pub-id>, PMID: <pub-id pub-id-type="pmid">36732671</pub-id></citation></ref>
<ref id="ref37"><label>37.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>He</surname><given-names>Y</given-names></name> <name><surname>Shen</surname><given-names>Z</given-names></name> <name><surname>Cui</surname><given-names>P</given-names></name></person-group>. <article-title>Towards non-iid image classification: a dataset and baselines</article-title>. <source>Pattern Recogn</source>. (<year>2021</year>) <volume>110</volume>:<fpage>107383</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.patcog.2020.107383</pub-id></citation></ref>
<ref id="ref38"><label>38.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kernbach</surname><given-names>JM</given-names></name> <name><surname>Staartjes</surname><given-names>VE</given-names></name></person-group>. <article-title>Foundations of machine learning-based clinical prediction modeling: part II&#x2014;generalization and overfitting</article-title>. <source>Machine Learn Clin Neurosci: Foundations Applications</source>. (<year>2021</year>):<fpage>15</fpage>&#x2013;<lpage>21</lpage>.</citation></ref>
<ref id="ref39"><label>39.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Krittanawong</surname><given-names>C</given-names></name></person-group>. <source>Artificial intelligence in clinical practice: How AI technologies impact medical research and clinics</source>. San Diego: <publisher-name>Elsevier</publisher-name> (<year>2023</year>).</citation></ref>
<ref id="ref40"><label>40.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ojo</surname><given-names>B</given-names></name> <name><surname>Campbell</surname><given-names>CH</given-names></name></person-group>. <article-title>Perioperative acute kidney injury: impact and recent update</article-title>. <source>Curr Opin Anaesthesiol</source>. (<year>2022</year>) <volume>35</volume>:<fpage>215</fpage>&#x2013;<lpage>23</lpage>. doi: <pub-id pub-id-type="doi">10.1097/ACO.0000000000001104</pub-id>, PMID: <pub-id pub-id-type="pmid">35102042</pub-id></citation></ref>
<ref id="ref41"><label>41.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname><given-names>H-T</given-names></name> <name><surname>Cheon</surname><given-names>H-R</given-names></name> <name><surname>Lee</surname><given-names>S-H</given-names></name> <name><surname>Shim</surname><given-names>M</given-names></name> <name><surname>Hwang</surname><given-names>H-J</given-names></name></person-group>. <article-title>Risk of data leakage in estimating the diagnostic performance of a deep-learning-based computer-aided system for psychiatric disorders</article-title>. <source>Sci Rep</source>. (<year>2023</year>) <volume>13</volume>:<fpage>16633</fpage>. doi: <pub-id pub-id-type="doi">10.1038/s41598-023-43542-8</pub-id>, PMID: <pub-id pub-id-type="pmid">37789047</pub-id></citation></ref>
<ref id="ref42"><label>42.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gygi</surname><given-names>JP</given-names></name> <name><surname>Kleinstein</surname><given-names>SH</given-names></name> <name><surname>Guan</surname><given-names>L</given-names></name></person-group>. <article-title>Predictive overfitting in immunological applications: pitfalls and solutions</article-title>. <source>Hum Vaccin Immunother</source>. (<year>2023</year>) <volume>19</volume>:<fpage>2251830</fpage>. doi: <pub-id pub-id-type="doi">10.1080/21645515.2023.2251830</pub-id>, PMID: <pub-id pub-id-type="pmid">37697867</pub-id></citation></ref>
<ref id="ref43"><label>43.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname><given-names>P-r</given-names></name> <name><surname>Lu</surname><given-names>L</given-names></name> <name><surname>Zhang</surname><given-names>J-y</given-names></name> <name><surname>Huo</surname><given-names>TT</given-names></name> <name><surname>Liu</surname><given-names>SX</given-names></name> <name><surname>Ye</surname><given-names>ZW</given-names></name></person-group>. <article-title>Application of artificial intelligence in medicine: an overview</article-title>. <source>Curr Med Sci</source>. (<year>2021</year>) <volume>41</volume>:<fpage>1105</fpage>&#x2013;<lpage>15</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s11596-021-2474-3</pub-id>, PMID: <pub-id pub-id-type="pmid">34874486</pub-id></citation></ref>
<ref id="ref44"><label>44.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>&#x00C7;i&#x00E7;ek</surname><given-names>&#x00D6;F</given-names></name> <name><surname>Aky&#x00FC;rek</surname><given-names>F</given-names></name> <name><surname>Akbayrak</surname><given-names>H</given-names></name> <name><surname>Orhan</surname><given-names>A</given-names></name> <name><surname>Kaya</surname><given-names>EC</given-names></name> <name><surname>B&#x00FC;y&#x00FC;kate&#x015F;</surname><given-names>M</given-names></name></person-group>. <article-title>Can preoperative neopterin levels predict acute kidney injury in patients undergoing on-pump cardiac surgery?</article-title> <source>Turk J Biochem</source>. (<year>2023</year>) <volume>48</volume>:<fpage>531</fpage>&#x2013;<lpage>40</lpage>. doi: <pub-id pub-id-type="doi">10.1515/tjb-2023-0074</pub-id></citation></ref>
<ref id="ref45"><label>45.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Okita</surname><given-names>J</given-names></name> <name><surname>Nakata</surname><given-names>T</given-names></name> <name><surname>Uchida</surname><given-names>H</given-names></name> <name><surname>Kudo</surname><given-names>A</given-names></name> <name><surname>Fukuda</surname><given-names>A</given-names></name> <name><surname>Ueno</surname><given-names>T</given-names></name> <etal/></person-group>. <article-title>Development and validation of a machine learning model to predict time to renal replacement therapy in patients with chronic kidney disease</article-title>. <source>BMC Nephrol</source>. (<year>2024</year>) <volume>25</volume>:<fpage>101</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s12882-024-03527-9</pub-id>, PMID: <pub-id pub-id-type="pmid">38493099</pub-id></citation></ref>
</ref-list>
</back>
</article>