AUTHOR=Kiragga Agnes N. , Iddi Samuel , Walekhwa Abel W. , Barasa Miranda , Cygu Steve , Odhiambo Rachel , Gningue Moctar , Mboup Aminata , Onana Anicet , Adnew Bethlehem , Alemu Ashuro Akililu , Hudson Simon , Greenfield Jay , Todd Jim , Bhattacharjee Tathagata , Sharan Malvika , Sonabend Raphael , Kadengye Damazo , Mbatchou Bertrand Hugo , Abdissa Alemseged , Sarr Moussa , Nabende Joyce Nakatumba , Bamutura Moses , Tamurat Bekure , Dereje Nebiyu , Temfack Elvis TITLE=Data science without borders: bridging the divide in data science capacity across African health institutions JOURNAL=Frontiers in Public Health VOLUME=Volume 13 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2025.1695907 DOI=10.3389/fpubh.2025.1695907 ISSN=2296-2565 ABSTRACT=Background: Effective public health data science in Africa requires a comprehensive understanding of institutional capabilities across multiple dimensions. This study conducted a multidimensional assessment of three African health institutions to examine the availability of health data, healthcare training, data governance needs, and infrastructure capabilities, to inform the use of data science tools to address health challenges. Methods: The study is a baseline assessment for the Data Science Without Borders Project, a three-year multi-country project implemented in three African health institutions: the Institute for Health Research, Epidemiological Surveillance and Training (IRESSEF) in Senegal, Armauer Hansen Research Institute (AHRI) in Ethiopia, and the Douala General Hospital (DGH) in Cameroon. We designed a baseline structured needs assessment survey to assess: (1) health data availability across sixteen (16) dataset categories; (2) training needs across seven (7) domains and data governance considerations; and (3) infrastructure capabilities, including computing resources, connectivity, and service availability.
We then conducted an integrated analysis to identify patterns, gaps, and opportunities across these dimensions to inform project implementation. Results: The assessment revealed different institutional profiles with complementary strengths and limitations, which are critical for the effective use of data science tools. IRESSEF demonstrated rich data resources (particularly in genomics, maternal health, and geographical health differences), moderate infrastructure limitations (8 GB RAM, 67% service capability), and high training needs (data & analytics: 4.7/5.0, data governance: 4.0/5.0). AHRI exhibited superior computing resources (512 GB RAM, 64 CPU cores), specialized surveillance data (9.9%), and moderate training needs (average: 3.0/5.0). DGH demonstrated focused strengths in infectious disease research (3.3%), moderate computing resources (32 GB RAM), and substantial opportunities to use electronic health records for research. Common priorities across institutions included the need to enhance data & analytical capabilities (average: 4.3/5.0), the use of advanced artificial intelligence and machine learning analysis techniques (IRESSEF: 5.0, AHRI: 4.0, DGH: 5.0), and, importantly, the need to establish data governance structures that increase the partners' capacity to share data for collaborative consortium analyses across Africa. Conclusion: Our integrated assessment suggests that effective capacity building requires moving beyond standardized approaches to embrace a phased model that leverages institutional needs and complementarities. We recommend: (1) establishing robust data governance frameworks as a foundation; (2) implementing a phased and customized approach where institutions receive training according to their immediate demands and strengths; and (3) addressing critical infrastructure gaps to support data science projects in Africa that support federated analyses to maintain data sovereignty.
This approach offers potential for a varying African approach to health data science, which could extend to AI adoption and broader continental collaboration.