AUTHOR=Stephen Adebayo Osasan, Oche Ambrose George, Olusola Daramola, Oluwafemi Adefolalu, Alzahrani Hind A., Hasan Abdulkarim TITLE=QSAR-guided discovery of novel KRAS inhibitors for lung cancer therapy JOURNAL=Frontiers in Bioinformatics VOLUME=Volume 5 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/bioinformatics/articles/10.3389/fbinf.2025.1663846 DOI=10.3389/fbinf.2025.1663846 ISSN=2673-7647 ABSTRACT=Introduction: KRAS mutations are key oncogenic drivers in lung cancer, yet effective pharmacological targeting has remained a major challenge due to the protein's elusive and dynamic binding pockets. Computational modeling offers a promising route to identify novel inhibitors with improved potency and selectivity. Methods: A quantitative structure–activity relationship (QSAR) modeling approach was developed to predict the inhibitory potency (pIC50) of KRAS inhibitors and support de novo drug design. Molecular descriptors for 62 inhibitors retrieved from the ChEMBL database (CHEMBL4354832) were computed using Chemopy. Following descriptor normalization and dimensionality reduction, five machine learning algorithms were applied: partial least squares (PLS), random forest (RF), stepwise multiple linear regression (MLR), genetic algorithm-optimized MLR (GA-MLR), and XGBoost. Model performance was evaluated using R², RMSE, and MAE, while permutation-based importance and SHAP analyses provided feature interpretability. Results: Among the models tested, PLS exhibited the best predictive performance (R² = 0.851; RMSE = 0.292), followed by RF (R² = 0.796). The GA-MLR model, based on eight optimized molecular descriptors, achieved good interpretability and robust internal validation (R² = 0.677). 
Virtual screening of 56 de novo designed compounds within the model's applicability domain identified compound C9, with a predicted pIC50 of 8.11, as the most promising hit. Discussion: This integrative QSAR modeling and de novo design framework effectively predicted the bioactivity of KRAS inhibitors and facilitated the identification of novel candidate molecules. The findings demonstrate the utility of combining interpretable machine learning models with virtual screening to accelerate the discovery of potent KRAS inhibitors for lung cancer therapy.
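
The abstract reports potency as pIC50 and scores the models by R², RMSE, and MAE. As a rough illustration of what those quantities mean, the sketch below uses the standard definition pIC50 = -log10(IC50 in mol/L) and the usual formulas for the three metrics. It is not the authors' code: the function names `pic50_from_ic50_nm` and `regression_metrics` are hypothetical, the input values are invented for illustration, and the abstract does not state the data split or validation protocol used in the study.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm)
    return 9.0 - math.log10(ic50_nm)


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot          # coefficient of determination
    rmse = math.sqrt(ss_res / n)        # root mean squared error
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n  # mean absolute error
    return r2, rmse, mae


# Hypothetical values for illustration only; these are not data from the study.
observed = [pic50_from_ic50_nm(x) for x in (10.0, 100.0, 1000.0, 50.0)]
predicted = [7.8, 7.1, 6.2, 7.4]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

Because pIC50 is a logarithmic scale, an RMSE of about 0.3 corresponds to roughly a twofold error in IC50, which gives a sense of scale for the reported values.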