AUTHOR=Jang MyeongGyun, Kim DongOk, Yoon Sujung, Lee Hwamin
TITLE=Predicting alcohol use disorder risk in firefighters using a multimodal deep learning model: a cross-sectional study
JOURNAL=Frontiers in Psychiatry
VOLUME=16
YEAR=2025
URL=https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2025.1643552
DOI=10.3389/fpsyt.2025.1643552
ISSN=1664-0640
ABSTRACT=
Introduction: Firefighters constitute a high-risk occupational cohort for alcohol use disorder (AUD) due to chronic trauma exposure, yet traditional screening based on self-report instruments is compromised by systematic underreporting driven by occupational stigma and career-preservation concerns. This cross-sectional study developed and validated a multimodal deep learning framework integrating T1-weighted structural magnetic resonance imaging with standardized neuropsychological assessments to enable objective AUD risk stratification without requiring computationally intensive functional neuroimaging protocols.
Methods: Analysis of 689 active-duty firefighters (mean age 43.3 ± 8.8 years; 93% male) from a nationwide occupational cohort incorporated high-resolution three-dimensional T1-weighted structural MRI acquisition alongside comprehensive neuropsychological evaluation, using the Grooved Pegboard Test to assess visual-motor coordination and the Trail Making Test to quantify executive function. The computational architecture combined ResNet-50 convolutional neural networks for hierarchical morphological feature extraction, Vision Transformer modules for global neuroanatomical pattern recognition, and a multilayer perceptron for integration of clinical variables; model interpretability was assessed with Gradient-weighted Class Activation Mapping (Grad-CAM) and SHapley Additive exPlanations (SHAP). Performance was evaluated with stratified three-fold cross-validation and DeLong's test for statistical comparison of receiver operating characteristic (ROC) curves.
Results: The multimodal framework achieved 79.88% classification accuracy with an area under the ROC curve of 79.65%, a statistically significant improvement over clinical-only (62.53%; p < 0.001) and neuroimaging-only (61.53%; p < 0.001) models, corresponding to a 17.35 percentage-point gain attributable to synergistic cross-modal integration rather than simple feature concatenation. Interpretability analyses revealed stochastic activation patterns in unimodal neuroimaging models lacking neuroanatomically coherent feature localization, whereas clinical feature importance ranked biological sex and motor coordination metrics as the primary predictive indicators. The framework maintained robust calibration across probability thresholds, supporting operational feasibility for clinical deployment.
Discussion: This study establishes that structural neuroimaging combined with targeted neuropsychological assessment achieves classification performance comparable to more complex multimodal protocols while substantially reducing acquisition time and computational requirements, offering a pragmatic pathway toward objective AUD screening in high-risk occupational populations, with broader implications for psychiatric risk stratification in trauma-exposed professions.
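
Note: The abstract describes a three-branch fusion architecture (ResNet-50 for local morphological features, a Vision Transformer for global neuroanatomical patterns, and an MLP over clinical variables). The sketch below is a minimal, hypothetical illustration of that general late-fusion pattern, assuming a PyTorch implementation with torchvision's 2D resnet50 and vit_b_16 as stand-in backbones applied to stacked 2D slices; the class name MultimodalAUDClassifier, the number of clinical features, all layer dimensions, and the 2D-slice input convention are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a late-fusion multimodal classifier in the spirit of the
# abstract: ResNet-50 + ViT image branches concatenated with an MLP clinical branch.
# All dimensions and names are illustrative; the paper's exact implementation
# (e.g., 3D MRI handling, fusion strategy, head design) is not specified here.
import torch
import torch.nn as nn
from torchvision.models import resnet50, vit_b_16

class MultimodalAUDClassifier(nn.Module):
    def __init__(self, n_clinical: int = 8, n_classes: int = 2):
        super().__init__()
        # CNN branch: ResNet-50 trunk for hierarchical morphological features
        backbone = resnet50(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # -> (B, 2048, 1, 1)
        # Transformer branch: ViT-B/16 for global pattern recognition
        vit = vit_b_16(weights=None)
        vit.heads = nn.Identity()  # keep the 768-d class-token embedding
        self.vit = vit
        # Clinical branch: MLP over standardized neuropsychological/demographic variables
        self.clinical_mlp = nn.Sequential(
            nn.Linear(n_clinical, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
        )
        # Fusion head over the concatenated modality embeddings
        self.head = nn.Sequential(
            nn.Linear(2048 + 768 + 64, 256), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(256, n_classes),
        )

    def forward(self, mri_slices: torch.Tensor, clinical: torch.Tensor) -> torch.Tensor:
        # mri_slices: (B, 3, 224, 224), e.g., three anatomical slices stacked as channels
        cnn_feat = self.cnn(mri_slices).flatten(1)    # (B, 2048)
        vit_feat = self.vit(mri_slices)               # (B, 768)
        clin_feat = self.clinical_mlp(clinical)       # (B, 64)
        fused = torch.cat([cnn_feat, vit_feat, clin_feat], dim=1)
        return self.head(fused)                       # logits for AUD risk classes

# Example forward pass with dummy inputs
model = MultimodalAUDClassifier(n_clinical=8)
logits = model(torch.randn(4, 3, 224, 224), torch.randn(4, 8))
print(logits.shape)  # torch.Size([4, 2])
```

In this kind of sketch the modalities are fused only at the embedding level (concatenation before a shared head), which is one simple way to realize "cross-modal integration rather than simple feature concatenation" would require a richer fusion mechanism (e.g., cross-attention or gated fusion) than shown here; evaluation as described in the abstract would wrap such a model in stratified three-fold cross-validation with ROC comparison via DeLong's test.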