AUTHOR=Tsurumi Takayuki, Morita Kenji

TITLE=A neural network model combining the successor representation and actor-critic methods reveals effective biological use of the representation

JOURNAL=Frontiers in Computational Neuroscience

VOLUME=Volume 19 - 2025

YEAR=2025

URL=https://www.frontiersin.org/journals/computational-neuroscience/articles/10.3389/fncom.2025.1647462

DOI=10.3389/fncom.2025.1647462

ISSN=1662-5188

ABSTRACT=In learning goal-directed behavior, state representation is important for adapting to the environment and achieving goals. A predictive state representation called the successor representation (SR) has recently attracted attention as a candidate for state representation in animal brains, especially in the hippocampus. The relationship between the SR and the animal brain has been studied, and several neural network models for computing the SR have been proposed based on the findings. However, studies on implementations of the SR that involve action selection have not yet advanced significantly. Therefore, we explore possible mechanisms by which the SR is utilized biologically for action selection and for learning optimal action policies. The actor-critic architecture is a promising model of animal behavioral learning because of its correspondence to the anatomy and function of the basal ganglia, so it is suitable for our purpose. In this study, we construct neural network models for behavioral learning using the SR and investigate their properties by having them perform reinforcement learning. Specifically, we investigate the effect of using different state representations for the actor and the critic in the actor-critic method, and we also compare the actor-critic method with Q-learning and SARSA. We found that using the SR for the actor and using the SR for the critic have different effects, and observed that using the SR in conjunction with one-hot encoding makes it possible to learn with the benefits of both representations. These results suggest the possibility that the striatum can learn using multiple state representations complementarily.