AUTHOR=Yang Yan, Wang Tian, Fu Yiding, Huang Jingna, Zhou Dong
TITLE=Portfolio management based on value distribution reinforcement learning algorithm
JOURNAL=Frontiers in Artificial Intelligence
VOLUME=Volume 8 - 2025
YEAR=2026
URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1709493
DOI=10.3389/frai.2025.1709493
ISSN=2624-8212
ABSTRACT=Introduction: In the face of high uncertainty and complexity in financial markets, achieving portfolio return maximization while effectively controlling risk remains a critical challenge. Methods: We propose a novel portfolio management framework based on the value distribution maximum entropy actor-critic (VD-MEAC) reinforcement learning algorithm. We establish a framework where the agent’s actions represent portfolio weight adjustments and stock factors serve as state observations. For risk management, the critic network learns the complete distribution of future returns. For return enhancement, we incorporate entropy regularization. Results: We conduct extensive experiments using real market data from the Chinese stock market. Results demonstrate that our VD-MEAC strategy achieves an average return of 2.490 and an average Sharpe ratio of 2.978, significantly outperforming benchmark strategies. Discussion: These results validate the effectiveness of our approach in practical portfolio management scenarios.
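
The Methods and Results portions of the abstract name three ingredients: actions that set portfolio weights, a critic that models a distribution of returns under entropy regularization, and the Sharpe ratio as an evaluation metric. The Python sketch below illustrates each one in isolation. It is not the paper's implementation. The function names, the softmax weight mapping, the quantile representation of the return distribution, the soft target form (borrowed from the general style of distributional soft actor-critic methods), and the default values of `gamma`, `alpha`, and `periods_per_year` are all assumptions made for illustration.

```python
import math


def softmax_weights(scores):
    """Map unconstrained action scores to long-only weights summing to 1.

    Assumption: a softmax mapping; the abstract only says that actions
    represent portfolio weight adjustments.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_quantile_targets(reward, next_quantiles, next_log_prob,
                          gamma=0.99, alpha=0.2):
    """Entropy-regularized target for each quantile of the return distribution.

    Assumption: the critic outputs quantiles, and the entropy bonus
    (-alpha * log pi) is added to every next-state quantile before
    discounting. The paper may parameterize the distribution differently.
    """
    return [reward + gamma * (z - alpha * next_log_prob)
            for z in next_quantiles]


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period returns (sample std, ddof=1).

    Assumption: 252 trading periods per year; the abstract does not state
    the annualization convention behind its reported figure.
    """
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


if __name__ == "__main__":
    print(softmax_weights([0.5, 1.0, -0.2]))
    print(soft_quantile_targets(0.01, [-0.05, 0.0, 0.04], next_log_prob=-1.3))
    print(sharpe_ratio([0.01, -0.005, 0.02, 0.003]))
```

Keeping the whole set of quantile targets, rather than collapsing them to a mean, is what lets a risk-aware policy look at the lower tail of the distribution when choosing weights.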