Publications-Theses
Article View/Open
Publication Export
-
Google ScholarTM
NCCU Library
Citation Infomation
Related Publications in TAIR
題名 深度增強學習在動態資產配置上之應用— 以美國ETF為例
The Application of Deep Reinforcement Learning on Dynamic Asset Allocation : A Case Study of U.S. ETFs作者 劉上瑋 貢獻者 廖四郎
劉上瑋關鍵詞 動態資產配置
深度增強學習
Q-Learning
類神經網路
Dynamic asset allocation
Deep reinforcement learning
Q-Learning
Neural network日期 2017 上傳時間 1-Nov-2017 14:21:03 (UTC+8) 摘要 增強式學習(Reinforcement Learning)透過與環境不斷的互動來學習,以達到極大化每一期報酬的總和的目標,廣泛被運用於多期的決策過程。基於這些特性,增強式學習可以應用於建立需不斷動態調整投資組合配置比例的動態資產配置策略。 本研究應用Deep Q-Learning演算法建立動態資產配置策略,研究如何在每期不同的環境狀態之下,找出最佳的配置權重。採用2007年7月2日至2017年6月30日的美國中大型股的股票ETF及投資等級的債券ETF建立投資組合,以其日報酬率資料進行訓練,並與買進持有策略及固定比例投資策略比較績效,檢視深度增強式學習在動態資產配置適用性。
Reinforcement learning learns by interacting with the environment continuously, in order to achieve the target of maximizing the sum of each return. It has been used to solve multi-period decision making problem broadly. Because of these characteristics, reinforcement learning can be applied to build the strategies of dynamic asset allocation which keep reallocating the mix of portfolio consistently.In this study, we apply deep Q-Learning algorithm to build the strategies of dynamic asset allocation. Studying how to find the optimal weights in the different environment. We use Large-Cap, Mid-Cap ETFs and investment-grade bond ETFs in the U.S. to build up the portfolio. We train the model with the data of daily return, and then we measure its performance by comparing with buy-and-hold and constant-mix strategy to check the fitness of deep Q-Learning.參考文獻 [1] Arnott, R. D., Hsu, J., & Moore, P. (2005). Fundamental indexation. Financial Analysts Journal, 61(2), 83-99.[2] Bellman, R.E. 1957. Dynamic Programming. Princeton University Press, Princeton, NJ. Republished 2003: Dover.[3] Daryanani, G. (2008). Opportunistic Rebalancing: A New Paradigm for Wealth Managers. Journal of Financial Planning, 21(1).[4] DeMiguel, V., Garlappi, L., & Uppal, R. (2007). Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy?. The review of Financial studies, 22(5), 1915-1953.[5] Brinson, G. P., Singer, B. D., & Beebower, G. L. (1991). Determinants of portfolio performance II: An update. Financial Analysts Journal, 47(3), 40-48.[6] Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of artificial intelligence research, 4, 237-285.[7] Kinga, D., & Adam, J. B. (2015). A method for stochastic optimization. In International Conference on Learning Representations (ICLR).[8] Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D. & Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.[9] Markowitz, H. (1952). Portfolio selection. The journal of finance, 7(1), 77-91.[10] Michaud, R. O. (1998). Efficient Asset Management: a practical guide to stock portfolio management and asset allocation. Financial Management Association, Survey and Synthesis Series. HBS Press, Boston, MA.[11] Michaud, R. O. (1989). The Markowitz optimization enigma: Is ‘optimized’optimal?. Financial Analysts Journal, 45(1), 31-42.[12] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, L., King, H., Kumaran, D., Wierstra, D., Legg, S., Hassabis, D. & Petersen, S. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533.[13] Perold, A. F., & Sharpe, W. F. (1988). Dynamic strategies for asset allocation. Financial Analysts Journal, 16-27.[14] Plaxco, L. M., & Arnott, R. D. (2002). Rebalancing a global policy benchmark. The Journal of Portfolio Management, 28(2), 9-22.[15] Sharpe, W. F. (1966). Mutual fund performance. The Journal of business, 39(1), 119-138.[16] Sharpe, W. F. (1987). Integrated asset allocation. Financial Analysts Journal, 43(5), 25-32.[17] Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction (Vol. 1, No. 1). Cambridge: MIT press.[18] Watkins, C. J., & Dayan, P. (1992). Q-learning. Machine learning, 8(3-4), 279-292. 描述 碩士
國立政治大學
金融學系
104352029資料來源 http://thesis.lib.nccu.edu.tw/record/#G0104352029 資料類型 thesis dc.contributor.advisor 廖四郎 zh_TW dc.contributor.author (Authors) 劉上瑋 zh_TW dc.creator (作者) 劉上瑋 zh_TW dc.date (日期) 2017 en_US dc.date.accessioned 1-Nov-2017 14:21:03 (UTC+8) - dc.date.available 1-Nov-2017 14:21:03 (UTC+8) - dc.date.issued (上傳時間) 1-Nov-2017 14:21:03 (UTC+8) - dc.identifier (Other Identifiers) G0104352029 en_US dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/114285 - dc.description (描述) 碩士 zh_TW dc.description (描述) 國立政治大學 zh_TW dc.description (描述) 金融學系 zh_TW dc.description (描述) 104352029 zh_TW dc.description.abstract (摘要) 增強式學習(Reinforcement Learning)透過與環境不斷的互動來學習,以達到極大化每一期報酬的總和的目標,廣泛被運用於多期的決策過程。基於這些特性,增強式學習可以應用於建立需不斷動態調整投資組合配置比例的動態資產配置策略。 本研究應用Deep Q-Learning演算法建立動態資產配置策略,研究如何在每期不同的環境狀態之下,找出最佳的配置權重。採用2007年7月2日至2017年6月30日的美國中大型股的股票ETF及投資等級的債券ETF建立投資組合,以其日報酬率資料進行訓練,並與買進持有策略及固定比例投資策略比較績效,檢視深度增強式學習在動態資產配置適用性。 zh_TW dc.description.abstract (摘要) Reinforcement learning learns by interacting with the environment continuously, in order to achieve the target of maximizing the sum of each return. It has been used to solve multi-period decision making problem broadly. Because of these characteristics, reinforcement learning can be applied to build the strategies of dynamic asset allocation which keep reallocating the mix of portfolio consistently.In this study, we apply deep Q-Learning algorithm to build the strategies of dynamic asset allocation. Studying how to find the optimal weights in the different environment. We use Large-Cap, Mid-Cap ETFs and investment-grade bond ETFs in the U.S. to build up the portfolio. We train the model with the data of daily return, and then we measure its performance by comparing with buy-and-hold and constant-mix strategy to check the fitness of deep Q-Learning. en_US dc.description.tableofcontents 第一章 緒論 1 第一節 研究背景與動機 1 第二節 研究目的 2第二章 文獻探討 3 第一節 資產配置 3 第二節 增強式學習 7第三章 研究方法 21 第一節 建構投資組合 21 第二節 資料處理 27 第三節 增強式學習系統設計 27 第四節 Deep Q-Network 29第四章 研究結果 33 第一節 結果分析 33第五章 結論與建議 34參考文獻 36 zh_TW dc.format.extent 1660080 bytes - dc.format.mimetype application/pdf - dc.source.uri (資料來源) http://thesis.lib.nccu.edu.tw/record/#G0104352029 en_US dc.subject (關鍵詞) 動態資產配置 zh_TW dc.subject (關鍵詞) 深度增強學習 zh_TW dc.subject (關鍵詞) Q-Learning zh_TW dc.subject (關鍵詞) 類神經網路 zh_TW dc.subject (關鍵詞) Dynamic asset allocation en_US dc.subject (關鍵詞) Deep reinforcement learning en_US dc.subject (關鍵詞) Q-Learning en_US dc.subject (關鍵詞) Neural network en_US dc.title (題名) 深度增強學習在動態資產配置上之應用— 以美國ETF為例 zh_TW dc.title (題名) The Application of Deep Reinforcement Learning on Dynamic Asset Allocation : A Case Study of U.S. ETFs en_US dc.type (資料類型) thesis en_US dc.relation.reference (參考文獻) [1] Arnott, R. D., Hsu, J., & Moore, P. (2005). Fundamental indexation. Financial Analysts Journal, 61(2), 83-99.[2] Bellman, R.E. 1957. Dynamic Programming. Princeton University Press, Princeton, NJ. Republished 2003: Dover.[3] Daryanani, G. (2008). Opportunistic Rebalancing: A New Paradigm for Wealth Managers. Journal of Financial Planning, 21(1).[4] DeMiguel, V., Garlappi, L., & Uppal, R. (2007). Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy?. The review of Financial studies, 22(5), 1915-1953.[5] Brinson, G. P., Singer, B. D., & Beebower, G. L. (1991). Determinants of portfolio performance II: An update. Financial Analysts Journal, 47(3), 40-48.[6] Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of artificial intelligence research, 4, 237-285.[7] Kinga, D., & Adam, J. B. (2015). A method for stochastic optimization. In International Conference on Learning Representations (ICLR).[8] Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D. & Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.[9] Markowitz, H. (1952). Portfolio selection. The journal of finance, 7(1), 77-91.[10] Michaud, R. O. (1998). Efficient Asset Management: a practical guide to stock portfolio management and asset allocation. Financial Management Association, Survey and Synthesis Series. HBS Press, Boston, MA.[11] Michaud, R. O. (1989). The Markowitz optimization enigma: Is ‘optimized’optimal?. Financial Analysts Journal, 45(1), 31-42.[12] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, L., King, H., Kumaran, D., Wierstra, D., Legg, S., Hassabis, D. & Petersen, S. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533.[13] Perold, A. F., & Sharpe, W. F. (1988). Dynamic strategies for asset allocation. Financial Analysts Journal, 16-27.[14] Plaxco, L. M., & Arnott, R. D. (2002). Rebalancing a global policy benchmark. The Journal of Portfolio Management, 28(2), 9-22.[15] Sharpe, W. F. (1966). Mutual fund performance. The Journal of business, 39(1), 119-138.[16] Sharpe, W. F. (1987). Integrated asset allocation. Financial Analysts Journal, 43(5), 25-32.[17] Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction (Vol. 1, No. 1). Cambridge: MIT press.[18] Watkins, C. J., & Dayan, P. (1992). Q-learning. Machine learning, 8(3-4), 279-292. zh_TW
