AUTHOR=Li Zishan, Wang Lei, Zhang Xunying, Wu Aiping, Liu Tao
TITLE=Integration of machine learning and large language models for screening and identifying key risk factors of acute kidney injury after cardiac surgery
JOURNAL=Frontiers in Medicine
VOLUME=Volume 12 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2025.1618222
DOI=10.3389/fmed.2025.1618222
ISSN=2296-858X
ABSTRACT=Objectives: This study aimed to identify critical risk factors for acute kidney injury (AKI) following cardiac surgery. By integrating patient data from the MIMIC-IV database with large language models (LLMs) and machine learning algorithms, we ensured the clinical relevance of the selected risk factors, providing robust insights for the early identification and intervention of postoperative AKI.
Methods: Intensive care unit (ICU) data from patients in the MIMIC-IV database who underwent cardiac surgery were analyzed. Lasso regression and random forest algorithms were used to select significant predictive features from high-dimensional data. Model evaluation involved 10-fold cross-validation and metrics including accuracy, sensitivity, specificity, and the area under the curve. To enhance clinical relevance, LLMs simulated expert judgment in cardiology and nephrology, and their output was further validated through discussions with clinical experts.
Results: In the cohort of 4,565 patients, a total of 113 important and shared risk factors for AKI were identified, including variables such as anion gap, arterial partial pressure of oxygen (PaO2), and fraction of inspired oxygen (FiO2). Among these, 18 key variables were identified as postoperative AKI predictors via machine learning and LLM-simulated expert validation.
These included anchor age, Creatinine (serum), BUN (Blood Urea Nitrogen), Potassium (serum), Sodium (serum), Lactic Acid, Troponin-T, Furosemide (Lasix), Vancomycin (Random), Gentamicin (Trough), Albumin 5%, ART BP Mean, Cardiac Output (thermodilution), Brain Natriuretic Peptide (BNP), Absolute Count - Lymphs, Absolute Count - Monos, and Absolute Count - Neuts. The integration of LLMs with machine learning algorithms proved effective in accurately identifying clinically relevant risk factors.
Conclusion: The proposed risk prediction approach for postoperative AKI following cardiac surgery, based on the collaborative analysis of machine learning and large language models (LLMs), effectively identified and validated key clinical risk factors. By simulating expert clinical reasoning, the LLMs significantly enhanced the medical relevance of feature selection and improved the clinical interpretability of the model. This approach provides a solid theoretical and practical foundation for the precise early identification and clinical intervention of postoperative AKI in cardiac surgery patients.
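
The evaluation described in the Methods (10-fold cross-validation scored by accuracy, sensitivity, specificity, and area under the curve) could be sketched roughly as follows. This is a minimal pure-Python illustration and not the authors' code: the function names, the 0.5 decision threshold, and the contiguous, unshuffled fold assignment are all assumptions, since the abstract does not specify them.

```python
from typing import Dict, List, Sequence, Tuple


def kfold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds and return (train, test) index lists.

    Assumption: no shuffling or stratification; the abstract does not say
    how folds were assigned.
    """
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed threshold (assumed 0.5)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(y_true) if len(y_true) else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # true negative rate
    }


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels (1 = AKI) and predicted probabilities, for illustration only.
    labels = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
    scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7, 0.3, 0.05]
    print(binary_metrics(labels, scores))
    print(auc(labels, scores))
    print(len(kfold_indices(len(labels), k=10)))
```

In a full pipeline, each `(train, test)` pair would be used to fit the feature-selection model on the training indices and score it on the held-out indices, with the per-fold metrics then averaged. A stratified split would usually be preferable when AKI cases are a minority of the cohort.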