Porfolio Optimization based on Reinforcement Learning

Summary

Portfolio Optimization (PO) is a financial engineering technique that rebalances resources across multiple risky assets at a particular moment to optimize returns and minimize risks. This study reveals two methods (static and dynamic) for PO. The static method utilized Maximum Sharpe Ratio to determine allocation weights that optimize prior profits. The dynamic approach employs Deep Reinforcement Learning (DRL) to build a model that analyzes stock movement and recommends immediate trading actions (buy/sell/hold). The dynamic method utilizes Twin Delayed Deep Deterministic Policy Gradient (TD3) because it performs well in continuous action spaces without producing excessive biases. The purpose of this study is to evaluate the performance of TD3 on relatively small datasets. Experiments using criteria such as expected returns and Sharpe Ratio indicate that DRL models outperform baseline models. Hyper-parameter tuning has a beneficial effect overall.

Avatar
Guoxin Sun
Advisory Engineer

Data Scientist, empowering the data-driven business leadership.

Publications

Used TD3 and DDPG to perform portfolio optimization on a small dataset.

Talks

2022 International Conference on Computer Science, Information Engineering and Digital Economy (CSIEDE2022)