AUTHOR=Souza Bruno, Castro Manuel, Esmin Ahmed, Machado Leonardo, Ferreira Alexandre, Rocha Anderson TITLE=Causality-driven feature representation for connectivity prediction JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 8 - 2025 YEAR=2026 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1686750 DOI=10.3389/frai.2025.1686750 ISSN=2624-8212 ABSTRACT=Causal reasoning is essential for understanding relationships and guiding decision-making across applications, as it allows for the identification of cause-and-effect relationships between variables. By uncovering the underlying processes that drive these relationships, causal reasoning enables more accurate predictions, controlled interventions, and the ability to distinguish genuine causal effects from mere correlations in complex systems. In oil field management, where interactions between injector and producer wells are inherently dynamic, it is vital to uncover causal connections to optimize recovery and minimize waste. Since controlled experiments are impractical in this setting, we must rely solely on observational data. In this paper, we develop a causality-inspired framework that leverages domain expertise to learn causal features for robust connectivity estimation. We address three challenges that complicate causal analysis: confounding factors, latency in system responses, and the complexity of inter-well interactions. First, we frame the problem through a causal lens and propose a novel framework that generates pairwise features driven by causal theory. This method captures meaningful representations of relationships within the oil field system. By constructing independent pairwise feature representations, our method implicitly accounts for confounding signals and enhances the reliability of connectivity estimation.
Furthermore, our approach requires only limited context data to train machine learning models that estimate the connectivity probability between injectors and producers. We first validate our methodology through experiments on synthetic and semi-synthetic datasets, ensuring its robustness across varied scenarios. We then apply it to the complex Brazilian Pre-Salt oil fields using public synthetic and real-world data. Our results show that the proposed method effectively identifies injector-producer connectivity while maintaining rapid training times. This enables scalability and provides an interpretable approach for complex dynamic systems through causal theory. While previous projects have employed causal methods in the oil field context, to the best of our knowledge, this is the first work to systematically formulate the problem using causal reasoning that explicitly accounts for relevant confounders, and to develop an approach that effectively addresses these challenges and facilitates the discovery of inter-well connections within an oil field.