Online portfolio management via deep reinforcement learning with high-frequency data
Institution:1. School of Information Management, Nanjing University, Nanjing 210023, China;2. School of International and Public Affairs, Shanghai Jiao Tong University, Shanghai 200230, China
Abstract:Recently, models based on the Transformer (Vaswani et al., 2017) have yielded superior results in many sequence-modeling tasks. The Transformer's ability to capture long-range dependencies and interactions makes it attractive for portfolio management (PM). However, the Transformer's built-in quadratic complexity prevents its direct application to the PM task. To address this problem, in this paper we propose a deep reinforcement learning-based PM framework called LSRE-CAAN, with two key components: a long sequence representations extractor and a cross-asset attention network. Direct policy gradient is used to solve the sequential decision problem in the PM process. We conduct numerical experiments in three aspects using four different cryptocurrency datasets, and the empirical results show that our framework is more effective than both traditional and state-of-the-art (SOTA) online portfolio strategies, achieving a 6x return on the best dataset. In terms of risk metrics, our framework has an average volatility of 0.46 and an average maximum drawdown of 0.27 across the four datasets, both lower than those of the vast majority of SOTA strategies. In addition, whereas most SOTA strategies suffer an average turnover rate above 50%, our framework maintains a relatively low turnover rate on all datasets. Finally, an efficiency analysis shows that our framework is no longer limited by the Transformer's quadratic dependency.
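The abstract reports volatility, maximum drawdown, and turnover rate without giving formulas; the sketch below shows the standard definitions of these three evaluation metrics, assuming per-period returns, a cumulative wealth curve, and a matrix of portfolio weights over time (function names are illustrative, not from the paper).

```python
import numpy as np

def volatility(returns):
    # Sample standard deviation of per-period returns
    # (annualization factors are omitted here).
    return float(np.std(returns, ddof=1))

def max_drawdown(wealth):
    # Largest relative peak-to-trough decline of the cumulative
    # wealth curve: max over t of (peak_so_far - wealth_t) / peak_so_far.
    peaks = np.maximum.accumulate(wealth)
    return float(np.max((peaks - wealth) / peaks))

def turnover(weights):
    # Average L1 change in portfolio weights between consecutive
    # rebalancing periods; rows of `weights` are one period each.
    diffs = np.abs(np.diff(weights, axis=0)).sum(axis=1)
    return float(diffs.mean())

wealth = np.array([1.0, 1.2, 0.9, 1.5, 1.3])
print(round(max_drawdown(wealth), 2))  # drop from 1.2 to 0.9 -> 0.25
```

Lower values are better for all three metrics, which is why the abstract contrasts its 0.46 volatility, 0.27 maximum drawdown, and sub-50% turnover against the SOTA baselines.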
Keywords:Portfolio management  Deep reinforcement learning  Cryptocurrency  Bitcoin  Online learning  High-frequency trading
This article is indexed by ScienceDirect and other databases.