AUTHOR=Alazmi Meshari TITLE=KmPred: prediction of Michaelis constants (Km) using an integrative machine learning framework JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 9 - 2026 YEAR=2026 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2026.1711471 DOI=10.3389/frai.2026.1711471 ISSN=2624-8212 ABSTRACT=Background and motivation: The Michaelis constant Km is one of the key kinetic parameters for quantifying enzyme-substrate affinity within Michaelis–Menten theory. Km values are traditionally determined through labor-intensive in vitro assays, and the rapid expansion of protein sequence and chemical databases has created a pressing need for computational prediction approaches. Methodology: Herein, we present an integrative machine learning framework, KmPred, for Km prediction that combines protein sequence embeddings from state-of-the-art language models with molecular descriptors derived from substrate SMILES representations. The methodology was benchmarked on the MPEK dataset and on the independent dataset assembled by Kroll et al. Results and discussion: On the MPEK dataset, the best model achieved a test MSE of 0.4995, RMSE of 0.7067, MAE of 0.5022, R² of 0.7049, and a PCC of 0.8398 (p < 1 × 10⁻⁶), outperforming the baseline MPEK model. On the Kroll dataset, KmPred achieved a test MSE of 0.6206, RMSE of 0.7878, R² of 0.5519, PCC of 0.7440, and Spearman’s ρ of 0.7342, a reasonable result relative to state-of-the-art methods. These results demonstrate that combining multi-modal protein sequence and ligand features with advanced machine learning architectures enables robust and generalizable Km prediction across diverse datasets. Specifically, we used LSTM and Transformer models solely for feature extraction, to capture complex sequential and contextual patterns from enzyme sequences, and employed XGBoost as the primary regression model for the final Km predictions.
Beyond methodological impact, this work highlights the role of AI-driven kinetic modeling in accelerating enzyme characterization, facilitating metabolic engineering, and enhancing drug discovery pipelines. Our approach thus establishes a foundation for predictive enzymology at scale, with significant potential to benefit biotechnology, synthetic biology, and national strategic initiatives such as Saudi Vision 2030.
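The evaluation metrics reported in the abstract (MSE, RMSE, MAE, R², PCC, and Spearman's ρ) can be illustrated with a minimal pure-Python sketch. This is not the authors' evaluation code: the function names and the toy values are hypothetical, and the sketch assumes only that true and predicted Km values are given as equal-length lists on whatever scale the model predicts.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired true/predicted values."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    r2 = 1.0 - (mse * n) / ss_tot  # 1 - SS_res / SS_tot
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values, for illustration only (not data from the paper).
toy_true = [1.0, 2.0, 3.0, 4.0]
toy_pred = [1.5, 1.5, 3.5, 3.5]
print(regression_metrics(toy_true, toy_pred))
print(pearson(toy_true, toy_pred), spearman(toy_true, toy_pred))
```

Because RMSE is the square root of MSE, the two can be cross-checked against each other: the reported RMSE values of 0.7067 and 0.7878 agree, up to rounding, with the square roots of the reported MSE values of 0.4995 and 0.6206.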