AUTHOR=Yu Qiu-Yan , Lin Ying , Zhou Yu-Run , Yang Xin-Jun , Hemelaar Joris TITLE=Antenatal prediction of small for gestational age at birth based on four birthweight standards using machine learning algorithms JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 8 - 2025 YEAR=2026 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1679979 DOI=10.3389/frai.2025.1679979 ISSN=2624-8212 ABSTRACT=Background: Accurate antenatal prediction of small for gestational age (SGA) at birth is essential to improve development and delivery of preventative and therapeutic interventions. This study aimed to assess the performance of machine learning (ML) models to predict SGA at birth among Chinese pregnancies classified according to the Chinese birthweight standard and three international birthweight standards. Methods: We collected multimodal, longitudinal, antenatal surveillance data on 350,135 singleton pregnancies in Wenzhou City, China, between Jan 1, 2014 and Dec 31, 2016. For three pregnancy intervals we developed ML prediction models for newborns classified as SGA using the China, Intergrowth-21st, Fetal Medicine Foundation (FMF), and Gestation-related Optimal Weight (GROW) standards. We applied lasso regression to conduct feature selection, and CatBoost, XGBoost, LightBoost, Artificial Neural Networks, Random Forest, Stacked ensemble model, and logistic regression for predictive modeling in training data sets, with validation in testing data sets. Results: Among 22,603 singleton pregnancies with complete data, the rate of SGA using the China standard was 6.1%, compared to 4.3%, 6.0%, and 9.7% for the Intergrowth-21st, GROW, and FMF standards, respectively. This pattern was maintained in the imputed data set (n = 225,523), with corresponding SGA rates of 6.8%, 4.8%, 7.4%, and 10.7%. Late pregnancy models (<37 weeks) predicted SGA better than middle pregnancy (<26 weeks) and early pregnancy (<18 weeks) models.
With the China standard, the logistic regression model in late pregnancy performed best with an area under the receiver operating characteristic curve (ROC-AUC) of 0.74. Logistic regression also performed better than ML algorithms with the Intergrowth-21st and GROW standards at each pregnancy interval, although differences were small. The Random Forest model with the FMF standard achieved superior performance at each pregnancy interval, reaching a ROC-AUC of 0.79 in late pregnancy. Notably, the middle pregnancy Random Forest model with the FMF standard already attained a ROC-AUC of 0.72 at 26 weeks’ gestation. Symphysis-fundal height, maternal abdominal circumference, maternal age, maternal height and weight, and parity were consistently identified as key predictors of SGA across the different standards. Conclusion: There are important differences in the classification of SGA at birth between national and international birthweight standards. Machine learning models and traditional logistic regression demonstrated comparable predictive performance for SGA identification. These findings hold promise for guiding risk-stratified prenatal care and optimizing resource allocation in clinical settings.
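The workflow summarized in the Methods (lasso-based feature selection, then several classifiers compared by ROC-AUC on held-out data) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic data from make_classification in place of the antenatal records, includes only two of the listed model families (logistic regression and Random Forest), and every setting (sample size, class balance of roughly 7% positives, alpha, number of trees) is a hypothetical choice made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for antenatal predictors (assumption: not the study data).
# About 7% of samples are labelled positive to mimic a rare outcome such as SGA.
X, y = make_classification(
    n_samples=4000,
    n_features=20,
    n_informative=5,
    n_redundant=2,
    n_clusters_per_class=1,
    weights=[0.93, 0.07],
    random_state=0,
)

# Stratified split into training and testing sets so both keep the outcome rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def build_pipeline(classifier):
    """Standardize, keep features with non-zero lasso coefficients, then classify."""
    return make_pipeline(
        StandardScaler(),
        # alpha is a hypothetical penalty strength, not a value from the study.
        SelectFromModel(Lasso(alpha=0.005)),
        classifier,
    )


# Two of the model families named in the abstract (illustrative settings).
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

results = {}
for name, classifier in models.items():
    pipeline = build_pipeline(classifier).fit(X_train, y_train)
    # ROC-AUC is computed from predicted probabilities of the positive class.
    positive_scores = pipeline.predict_proba(X_test)[:, 1]
    results[name] = roc_auc_score(y_test, positive_scores)
    print(f"{name}: ROC-AUC = {results[name]:.3f}")
```

Placing the scaler and the feature selector inside the pipeline means both are fitted on the training split only, so the testing split does not leak into feature selection. The ROC-AUC values printed here describe the synthetic data and say nothing about the study's results.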