AUTHOR=Jorquera Ricardo, Droppelmann Guillermo, Dollmann Max, Blanco Gonzalo, Ahumada Ignacio, Lira Alfonso, Feijoo Felipe TITLE=Mining the risk: early cardiovascular detection in workers JOURNAL=Frontiers in Medicine VOLUME=Volume 12 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2025.1678172 DOI=10.3389/fmed.2025.1678172 ISSN=2296-858X ABSTRACT=Background: Cardiovascular disease (CVD) is the leading cause of death worldwide. Although tools exist to assess individual cardiovascular risk (CVR), they often fall short in unique populations such as miners, who work under extreme conditions. To address these limitations, this study proposes the use of machine learning (ML) and longitudinal data to predict risk progression using accessible clinical markers. Body mass index (BMI) and blood glucose (BG) were chosen as key CVR proxies because they are affordable, measured routinely in occupational health checks, and responsive to metabolic stresses common in mining environments. Methods: We conducted a retrospective longitudinal analysis of 89,045 Chilean mining workers (420,966 pre-employment exams; 2021–2024). For each worker, we formed successive visit pairs to model transitions between clinically defined BMI and BG categories. Four binary outcomes, one per scenario, were specified for each biomarker (any upward transition; adjacent upward transition; obesity to morbid obesity for BMI or prediabetes to diabetes for BG; any transition ending in morbid obesity or diabetes). ML models were built to predict transitions for each scenario and biomarker. We applied a stratified 70/30 train–test split, repeated 7-fold cross-validation within training, random hyperparameter search (AUC objective), and downsampling of the majority classes within folds to address class imbalance. Performance on the original (imbalanced) test set was summarized by AUC, accuracy, sensitivity, and specificity, with 95% CIs derived from the cross-validation process. 
Agreement between models was assessed using Pearson correlations of their predicted probabilities. Results: Prediction of BMI transitions (N = 18,035 pairs) was highly accurate across models. The best performance occurred for severe progression (Scenario 4, defined as any transition ending in morbid obesity), where XGBoost (XGB) achieved AUC 0.95 and accuracy 0.91, with high sensitivity and strong specificity. For the broader BMI transitions in Scenarios 1–3, models remained reliable (AUC 0.84–0.87). BG transitions (N = 16,161 pairs) were harder to predict, but the results were still actionable. The strongest results were for progression to diabetes (Scenario 4), with random forest (RF) reaching AUC 0.83 (95% CI: 0.82–0.90) and accuracy 0.76; other BG scenarios yielded AUC 0.71–0.77. Cross-validation performance closely matched test performance, and pairwise probability correlations were typically >0.90 for BMI and >0.80 for BG in severe scenarios, indicating good generalization, consistent predictions across models, and no evidence of overfitting. Conclusion: ML models effectively predict clinically relevant BMI and BG risk transitions in occupational health data from mining workers. Longitudinal visit pairs and scenario-based evaluation helped the models achieve high AUC values while maintaining accuracy, sensitivity, generalization, and consistency. These findings highlight the potential of this approach to improve the assessment of CVR and support preventive decision-making in high-risk working populations.
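The visit-pair construction and the four scenario outcomes described in the Methods can be illustrated with a minimal sketch for BMI. This is not the authors' code. The BMI cut-offs (18.5, 25, 30, 40 kg/m^2), the category names, and the helper names `bmi_category`, `visit_pairs`, `scenario_outcomes`, and `label_worker` are assumptions made for illustration, since the abstract does not list the exact category definitions. The sketch also assumes that Scenario 4 counts only pairs that start below the top category.

```python
from bisect import bisect_right

# Assumed BMI cut-offs in kg/m^2 (hypothetical; not stated in the abstract).
BMI_CUTS = [18.5, 25.0, 30.0, 40.0]
BMI_LABELS = ["underweight", "normal", "overweight", "obesity", "morbid_obesity"]


def bmi_category(bmi):
    """Return the ordinal index (0..4) of the assumed BMI category."""
    return bisect_right(BMI_CUTS, bmi)


def visit_pairs(visits):
    """Pair each visit with the next one for a single worker.

    `visits` is a list of (iso_date, bmi) tuples; ISO date strings
    sort chronologically, so a plain sort orders the visits in time.
    """
    ordered = sorted(visits)
    return list(zip(ordered, ordered[1:]))


def scenario_outcomes(cat_from, cat_to):
    """Compute the four binary outcomes for one visit pair."""
    top = len(BMI_LABELS) - 1  # index of the most severe category
    return {
        "s1_any_upward": cat_to > cat_from,
        "s2_adjacent_upward": cat_to == cat_from + 1,
        "s3_obesity_to_morbid": cat_from == top - 1 and cat_to == top,
        # Assumption: a "transition" must start below the top category.
        "s4_ends_in_morbid": cat_to == top and cat_from < top,
    }


def label_worker(visits):
    """Return one outcome dict per successive visit pair of a worker."""
    rows = []
    for (_, bmi_a), (_, bmi_b) in visit_pairs(visits):
        rows.append(scenario_outcomes(bmi_category(bmi_a), bmi_category(bmi_b)))
    return rows
```

Calling `label_worker` on one worker's visits yields one row of four labels per successive pair. Each label column could then serve as the target of a separate binary classifier, in line with the per-scenario modeling the abstract describes. The same structure would apply to BG with glucose cut-offs in place of the BMI ones.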