AUTHOR=Blazquez-Folch Josep, Limones Andrade María, Calm Berta, Auñón García Juan Miguel, Alegret Montserrat, Muñoz Nathalia, Cano Amanda, Fernández Victoria, García-Gutiérrez Fernando, De Rojas Itziar, García-González Pablo, Olivé Clàudia, Puerta Raquel, Capdevila-Bayo María, Muñoz-Morales Álvaro, Bayón-Buján Paula, Miguel Andrea, Montrreal Laura, Espinosa Ana, Sanz-Cartagena Pilar, Rosende-Roca Maitee, Zaldua Carla, Gabirondo Peru, Cantero-Fortiz Yahveth, Gurruchaga Miren Jone, Tarraga Lluis, Boada Mercè, Ruiz Agustín, Marquié Marta, Valero Sergi TITLE=Federated learning for cognitive impairment detection using speech data JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 8 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1662859 DOI=10.3389/frai.2025.1662859 ISSN=2624-8212 ABSTRACT=Introduction: In Alzheimer’s disease (AD) research, clinical, neuroimaging, genetic, and biomarker data are vital for advancing its understanding and treatment. However, privacy concerns and limited datasets complicate data sharing. Federated learning (FL) offers a solution by enabling collaborative research while preserving data privacy. Methods: This study analyzed data from patients assessed at the Memory Unit of the Ace Alzheimer Center Barcelona who completed a standardized digital speech protocol. Acoustic features extracted from these recordings were used to distinguish between cognitively unimpaired (CU) and cognitively impaired (CI) individuals. The aim was to evaluate how data heterogeneity impacted the FL model performance across three scenarios: (1) equal contributions and class ratios, (2) unequal contributions, and (3) imbalanced class ratios.
In each scenario, the performance of local models trained using an MLP feed-forward neural network on institutional data was analyzed and compared to a global model created by aggregating these local models using Federated Averaging (FedAvg) and Iterative Data Aggregation (IDA). Results: The cohort included 2,239 participants: 221 CU individuals (mean age 66.8, 64.7% female) and 2,018 CI subjects, comprising 1,219 with mild cognitive impairment (mean age 74.3, 61.9% female) and 799 with mild AD dementia (mean age 80.8, 64.8% female). In scenarios 1 and 3, FL provided modest gains in accuracy and AUC. In scenario 2, FL markedly improved performance for the smaller dataset (balanced accuracy rising from 0.51 to 0.80) while preserving 0.86 accuracy in the larger dataset, highlighting scalability across heterogeneous conditions. Conclusion: These findings demonstrate the potential of FL to enable collaborative modeling of speech-based biomarkers for cognitive impairment detection, even under conditions of data imbalance and institutional disparity. This work highlights FL as a scalable and privacy-preserving approach for advancing digital health research in neurodegenerative diseases.
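
The aggregation step named in the Methods can be illustrated with a minimal sketch of Federated Averaging as it is commonly defined: each site's parameters are averaged with a weight proportional to that site's number of training samples. This is not the authors' implementation. The function name `fedavg`, the flat dictionary layout of the parameters, and the site sizes (300 and 100) are hypothetical choices made only for illustration, and the IDA variant is not shown.

```python
from typing import Dict, List

# Hypothetical parameter layout: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def fedavg(local_weights: List[Weights], n_samples: List[int]) -> Weights:
    """Return the sample-size-weighted average of per-site parameters."""
    if not local_weights or len(local_weights) != len(n_samples):
        raise ValueError("need one sample count per set of local weights")
    total = sum(n_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_weights: Weights = {}
    for name in local_weights[0]:
        size = len(local_weights[0][name])
        # Each coordinate is a weighted mean across sites.
        global_weights[name] = [
            sum(w[name][i] * n for w, n in zip(local_weights, n_samples)) / total
            for i in range(size)
        ]
    return global_weights


# Two illustrative sites of unequal size (made-up values, not study data).
site_a = {"layer1": [1.0, 2.0], "bias1": [0.0]}
site_b = {"layer1": [3.0, 6.0], "bias1": [1.0]}
global_model = fedavg([site_a, site_b], n_samples=[300, 100])
print(global_model)
```

Because the weights are proportional to sample counts, the larger site pulls the global parameters toward its own values, which is the situation the unequal-contribution scenario (2) examines.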