AUTHOR=Choo Winston, Raghavan Shreyaa TITLE=Predicting methane emissions in smallholder dairy systems: a clustering and ensemble learning approach JOURNAL=Frontiers in Sustainable Food Systems VOLUME=Volume 9 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/sustainable-food-systems/articles/10.3389/fsufs.2025.1668517 DOI=10.3389/fsufs.2025.1668517 ISSN=2571-581X ABSTRACT=Methane (CH4) is the second most prevalent anthropogenic greenhouse gas and a major driver of climate change. In Indonesia, smallholder dairy farms contribute significantly to national CH4 emissions, primarily through enteric fermentation and manure management. However, these farms often lack access to effective tools for monitoring and mitigating emissions. This study introduces a machine learning-based framework to predict CH4 emissions from 32 smallholder dairy farms in Lembang, Indonesia. The farms were first clustered using K-means to identify groups of similar farm types. Models were then built to predict future CH4 emissions for each cluster by testing six approaches: linear regression, polynomial regression, Random Forest, XGBoost, SVR, and ARIMA. Stacked ensemble models (using unclustered base predictions, clustered base predictions, and a hybrid mix of the two) were then developed to integrate the strengths of each approach. Performance was evaluated using both time-based train-test splits and cross-validation, to assess real-world deployment and generalizability to other farms, respectively. The hybrid stacked model outperformed unclustered individual models in the cross-validation evaluation, achieving high accuracy across all emission types: enteric, manure, and total. Confidence and prediction interval analyses further confirmed the stability of its predictive behavior, independent of measurement uncertainty.
Overall, the proposed hybrid ensemble–clustering framework demonstrates the feasibility of machine learning–based CH4 forecasting in smallholder dairy systems, with implications for targeted mitigation and climate-smart policy planning.
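The pipeline summarized in the abstract (cluster the farms, fit unclustered and per-cluster predictors, then combine them and score on a time-based split) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the data are synthetic, herd size is a hypothetical clustering feature, k = 2 is chosen arbitrarily, a simple linear trend stands in for the six base learners, and a single blend weight fitted on a validation window stands in for the stacking meta-learner. Only the count of 32 farms is taken from the abstract.

```python
import random

def kmeans(points, k, iters=25):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    ranked = sorted(points)
    # Deterministic start: centroids at evenly spaced ranks of the sorted points.
    centroids = [ranked[i * (len(ranked) - 1) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def rmse(pred, true):
    return (sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)) ** 0.5

# Synthetic data (assumption): 32 farms, two herd-size regimes, 12 monthly readings.
rng = random.Random(0)
farms = [(rng.uniform(15, 25) if i % 2 else rng.uniform(3, 6),) for i in range(32)]
rows = [(f, t, 10.0 * farms[f][0] + 0.5 * t + rng.gauss(0, 1.0))
        for f in range(32) for t in range(12)]  # (farm index, month, emission)

labels = kmeans(farms, k=2)

# Time-based split: early months train, middle months fit the blend, last months test.
train = [r for r in rows if r[1] < 8]
valid = [r for r in rows if 8 <= r[1] < 10]
test = [r for r in rows if r[1] >= 10]

# Base models: one unclustered trend, plus one trend per cluster.
global_model = fit_line([t for _, t, _ in train], [y for _, _, y in train])
cluster_models = {c: fit_line([t for f, t, _ in train if labels[f] == c],
                              [y for f, _, y in train if labels[f] == c])
                  for c in set(labels)}

def base_predictions(data):
    """Return (unclustered, clustered) predictions for each row."""
    a, b = global_model
    unclustered = [a + b * t for _, t, _ in data]
    clustered = [cluster_models[labels[f]][0] + cluster_models[labels[f]][1] * t
                 for f, t, _ in data]
    return unclustered, clustered

# Hybrid blend: least-squares weight w on the unclustered prediction, clipped to [0, 1].
u_val, c_val = base_predictions(valid)
y_val = [y for _, _, y in valid]
num = sum((y - ci) * (ui - ci) for y, ui, ci in zip(y_val, u_val, c_val))
den = sum((ui - ci) ** 2 for ui, ci in zip(u_val, c_val))
w = min(1.0, max(0.0, num / den))

u_test, c_test = base_predictions(test)
y_test = [y for _, _, y in test]
hybrid = [w * ui + (1 - w) * ci for ui, ci in zip(u_test, c_test)]
rmse_unclustered = rmse(u_test, y_test)
rmse_hybrid = rmse(hybrid, y_test)
print(f"w={w:.3f}  unclustered RMSE={rmse_unclustered:.2f}  hybrid RMSE={rmse_hybrid:.2f}")
```

The split here holds out the final months for every farm, which mirrors the time-based evaluation mentioned in the abstract. Assessing generalizability to unseen farms would instead require holding out whole farms in each fold, which this sketch does not attempt.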