AUTHOR=Yu Mingming, Zhang Yanxi TITLE=Comparative models on low multiplier DRG classification for advanced lung cancer JOURNAL=Frontiers in Public Health VOLUME=Volume 13 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2025.1614938 DOI=10.3389/fpubh.2025.1614938 ISSN=2296-2565 ABSTRACT=Objective: This study aimed to compare the performance of machine learning models in predicting low multiplier DRGs for advanced lung cancer, and to identify the optimal algorithm along with key influencing factors. Methods: Prediction models for low multiplier DRGs in advanced lung cancer were developed using four machine learning algorithms: logistic regression, hybrid naive Bayes, support vector machine (SVM), and random forest. Model performance was evaluated, and key contributing features were identified. Results: The random forest algorithm achieved the highest AUC, accuracy, and precision across all three ER groups, indicating robust performance. In addition, cost-related features and length of hospital stay (LoS), which reflect “resource consumption”, contributed significantly more to the prediction of low multiplier DRGs than demographic factors such as gender and age. Conclusion: Based on comorbidity severity, the DRG classification for advanced lung cancer patients receiving internal medicine treatment under ER1 appeared reasonably structured and provided a valid basis for subgroup comparisons. Additionally, the predictive model’s findings showed potential signs of upcoding and intentional underuse of reimbursable medications, highlighting the need to monitor examination fee reductions across ER1 subgroups and to track medication costs in ER11 throughout the hospital stay. Lastly, in predicting low multiplier DRGs, larger datasets improve model stability. Model choice should align with the analytical goal: random forest offers higher precision and robustness, while logistic regression or SVM may be preferred for higher recall.
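
The abstract compares models on AUC, accuracy, precision, and recall, and notes that the preferred model depends on whether precision or recall matters more. The following is a minimal sketch of how these four metrics are computed for a binary "low multiplier DRG" label. It is not the authors' code: the function names, the toy labels, and both sets of predictions are hypothetical and do not reproduce any result from the study.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 = low multiplier case."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of predicted positives that are truly positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of true positives that the model finds."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative
    (ties count as half), computed over all positive/negative pairs."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 low multiplier cases (1) and 6 other cases (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Hypothetical predictions from a cautious model (few positives flagged)
# and a liberal model (many positives flagged).
precision_oriented = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
recall_oriented = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical predicted probabilities used only to illustrate AUC.
scores = [0.9, 0.8, 0.4, 0.35, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1]

for name, pred in [("precision-oriented", precision_oriented),
                   ("recall-oriented", recall_oriented)]:
    print(name, accuracy(y_true, pred), precision(y_true, pred), recall(y_true, pred))
print("AUC", auc(y_true, scores))
```

In this toy setup both sets of predictions have the same accuracy, yet one has perfect precision with low recall and the other the reverse, which is why a single metric is not enough to choose between models.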