AUTHOR=Ghosh Soham , Mittal Gaurav TITLE=Federated learning for critical electrical infrastructure—handling data heterogeneity for predictive maintenance of substation equipment JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 8 - 2025 YEAR=2026 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1697175 DOI=10.3389/frai.2025.1697175 ISSN=2624-8212 ABSTRACT=High-voltage substations form the backbone of critical electrical infrastructure, making predictive maintenance essential for ensuring grid resilience and operational reliability. Federated learning (FL) presents an innovative strategy for predictive maintenance, allowing multiple utility providers to improve model performance jointly while maintaining data confidentiality. Rather than transmitting raw records, each electrical utility performs local model updates and shares only the refined parameters, thereby safeguarding sensitive information and capitalizing on the heterogeneity of equipment conditions across sites. This study develops a set of privacy-preserving FL frameworks to enhance preventive maintenance of substation circuit breakers, large power transformers, and emergency generators. It rigorously tackles the issue of data heterogeneity arising from variations in distribution patterns across utilities, an inherent challenge that hampers effective collaborative model development. Four FL strategies, namely Federated Averaging (FedAvg and FedAvgM), Federated Proximal (FedProx), and Federated Batch Normalization (FedBN), are evaluated for robustness against distributional shifts. Model performance in this study is evaluated using the F-score, which for the non-IID case ranges from 0.60 to 0.88 depending on the number of clients, the federated learning algorithm used, and the non-IID partitioning strategy employed. Also, a first-of-its-kind Federated Information Criterion (FIC) is proposed in this manuscript as an extension of the classical information criterion.
The results demonstrate that FedBN is best suited to mitigating cross-utility heterogeneity, yielding the highest F-score of 0.88 and a moderately low FIC score of 4.35. Such tailored FL methods significantly improve predictive accuracy, enabling scalable and privacy-preserving deployment of FL in critical power system applications.
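
The parameter-sharing step described in the abstract, where each utility trains locally and sends only model parameters, can be illustrated with a minimal sketch of the standard FedAvg aggregation rule (a sample-size-weighted average of client parameters). This is not the study's implementation: the function name `fedavg`, the `Params` type, and the representation of a model as named lists of floats are assumptions made here for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a model is a dict of named parameter vectors.
Params = Dict[str, List[float]]


def fedavg(client_updates: List[Tuple[Params, int]]) -> Params:
    """Return the sample-size-weighted average of client parameters.

    Each entry is (locally trained parameters, number of local samples).
    Raw records never appear here; only parameters are aggregated.
    """
    if not client_updates:
        raise ValueError("need at least one client update")
    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros shaped like the first client's parameters.
    first_params, _ = client_updates[0]
    averaged: Params = {
        name: [0.0] * len(values) for name, values in first_params.items()
    }

    # Accumulate each client's contribution, weighted by its data share.
    for params, n in client_updates:
        weight = n / total_samples
        for name, values in params.items():
            for i, value in enumerate(values):
                averaged[name][i] += weight * value
    return averaged


# Two hypothetical utilities with different amounts of local data.
utility_a = ({"w": [1.0, 2.0]}, 100)
utility_b = ({"w": [3.0, 4.0]}, 300)
global_params = fedavg([utility_a, utility_b])
```

In this sketch the utility with more local samples pulls the global parameters toward its own. The other strategies named in the abstract (FedAvgM, FedProx, FedBN) change how local training or aggregation is done and are not shown here.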