AUTHOR=Zhang Chenming, Niu Bin, Wang Rong, Zhang Liaoyun
TITLE=From traditional metabolic markers to ensemble learning: comparative application of machine learning models for predicting NAFLD risk in adolescents
JOURNAL=Frontiers in Endocrinology
VOLUME=Volume 16 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/endocrinology/articles/10.3389/fendo.2025.1681686
DOI=10.3389/fendo.2025.1681686
ISSN=1664-2392
ABSTRACT=Background: Non-alcoholic fatty liver disease (NAFLD) is increasingly prevalent among adolescents and poses a significant public health challenge. Due to limitations in imaging and invasive diagnostic methods such as liver biopsy, there is a pressing need for accurate, cost-effective, and non-invasive risk prediction tools. This study aims to develop and compare multiple machine learning (ML) models to predict NAFLD risk in adolescents using routine anthropometric and laboratory data from the National Health and Nutrition Examination Survey (NHANES) 2011–2020 dataset. Methods: Data from 2,132 U.S. adolescents (NHANES 2011–2020) were analyzed. Nine machine learning (ML) models were developed using features selected by Light Gradient Boosting Machine (LightGBM). Performance was assessed by AUC, accuracy, sensitivity, precision, F1-score, and calibration. The Extra Trees (ET) model was further compared with TyG-based logistic regression models. Model interpretability was evaluated using SHapley Additive exPlanations (SHAP), and an interactive online prediction tool was deployed. Results: NAFLD prevalence was 13.0%. The ET model achieved the best overall performance (AUC = 0.784, ACC = 0.773, Kappa = 0.320), outperforming other ML algorithms and TyG-based models, which showed higher sensitivity but poorer precision. SHAP analysis identified waist circumference, triglycerides, insulin, and HDL as key predictors, revealing nonlinear threshold effects. The online tool allows individualized risk estimation based on routine clinical variables. Conclusion: The ET-based ML model provides an accurate and interpretable approach for adolescent NAFLD risk prediction. By surpassing traditional metabolic indicators and offering an accessible web-based calculator, it supports scalable, cost-effective early screening and targeted prevention strategies.
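
The abstract compares models on accuracy, sensitivity, precision, F1-score, and Cohen's kappa, and uses the triglyceride-glucose (TyG) index as the traditional baseline. The sketch below shows how these quantities are commonly computed; it is an illustration only, not the authors' code. It assumes the widely used definition TyG = ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2), which the abstract does not spell out, and the function names and the confusion-matrix counts are hypothetical values chosen to mimic a roughly 13% prevalence, not results from the study.

```python
import math


def tyg_index(triglycerides_mg_dl, glucose_mg_dl):
    """TyG index under the assumed definition ln(TG * glucose / 2), both in mg/dL."""
    if triglycerides_mg_dl <= 0 or glucose_mg_dl <= 0:
        raise ValueError("triglycerides and glucose must be positive")
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2)


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity, precision, F1 and Cohen's kappa from a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # recall on the positive (NAFLD) class
    precision = tp / (tp + fp)     # share of predicted positives that are true
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement p_e,
    # where p_e is derived from the marginal totals of the matrix.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((fn + tn) / n) * ((fp + tn) / n)
    p_e = p_pos + p_neg
    kappa = (accuracy - p_e) / (1 - p_e)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical counts (NOT from the study): 100 subjects, 13 true positives overall.
metrics = binary_metrics(tp=8, fp=15, fn=5, tn=72)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Hypothetical lab values, for illustration only.
print(f"TyG: {tyg_index(100, 90):.3f}")
```

With an imbalanced outcome such as this one, accuracy can look high while kappa stays modest, because most agreement comes from the majority negative class. That is one reason to report kappa, precision, and sensitivity alongside accuracy, as the abstract does.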