AUTHOR=Kushwaha Neetu, Singh Alok, Sheikh Hassan Aftab
TITLE=NatureKG: an ontology and knowledge graph for nature finance with a Text2Cypher application
JOURNAL=Frontiers in Artificial Intelligence
VOLUME=Volume 8 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1693843
DOI=10.3389/frai.2025.1693843
ISSN=2624-8212
ABSTRACT=Introduction: Nature finance involves complex, multi-dimensional challenges that require analytical frameworks to assess risks, impacts, dependencies, and systemic resilience. Existing financial systems lack structured tools to map dependencies between natural capital and financial assets. To address this, we introduce NatureKG, the first ontology and instantiated knowledge graph (KG) specifically tailored to nature finance, aiming to support financial institutions in assessing environmental risks, impacts, and dependencies systematically. Methods: We designed a domain ontology grounded in ENCORE, the Science-Based Targets Network (SBTN), and peer-reviewed literature. This ontology defines entities such as Actions, Drivers of Nature Loss, Value Chains, Evidence, and Sources. The ontology was instantiated into NatureKG within Neo4j, consisting of 320 nodes and 540 relationships curated by domain experts. As a proof of concept, we constructed a Text2Cypher dataset and fine-tuned three open-source large language models (Phi-3, LLaMA-3.1-8B, and Mistral-7B) to translate natural language queries into Cypher graph queries. The models were trained and evaluated under different dataset split strategies (paraphrase, Cypher-level, and generalization) using metrics such as BLEU, exact match, execution accuracy, and Macro F1 scores. Results: Phi-3 achieved the highest execution accuracy (0.21) and Macro F1 score (0.56), demonstrating better structural and reasoning capability under paraphrase and schema generalization splits.
LLaMA-3.1-8B exhibited balanced performance, while Mistral-7B lagged across most metrics. The results indicate that smaller, fine-tuned models can generalize effectively in low-resource, domain-specific settings, validating the feasibility of LLM-assisted querying for nature finance. Discussion: Despite modest initial accuracy, this feasibility study establishes a baseline for integrating domain-specific ontologies with AI systems. NatureKG offers a reusable foundation for representing environmental risks, dependencies, and interventions, with potential to enhance transparency and scalability in sustainable finance decision support. Future work should expand dataset diversity, extend sectoral coverage beyond the built environment, and refine model reasoning through larger, domain-aligned data catalogues.