The Application of Deep Reinforcement Learning on Dynamic Asset Allocation: A Case Study of U.S. ETFs | Publication

Title	深度增強學習在動態資產配置上之應用— 以美國ETF為例 (The Application of Deep Reinforcement Learning on Dynamic Asset Allocation: A Case Study of U.S. ETFs)
Author	劉上瑋
Contributors	廖四郎 (advisor); 劉上瑋
Keywords	Dynamic asset allocation; Deep reinforcement learning; Q-Learning; Neural network
Date	2017
Upload time	1-Nov-2017 14:21:03 (UTC+8)
Abstract	Reinforcement learning learns through continual interaction with an environment, with the goal of maximizing the sum of the rewards earned in each period, and is widely used in multi-period decision processes. These properties make it a natural fit for dynamic asset allocation strategies, which must keep adjusting the portfolio's allocation weights. This study applies the Deep Q-Learning algorithm to build a dynamic asset allocation strategy and examines how to find the optimal allocation weights under the environment state observed in each period. The portfolio is built from U.S. large- and mid-cap equity ETFs and investment-grade bond ETFs over the period from July 2, 2007 to June 30, 2017. The model is trained on their daily returns, and its performance is compared with buy-and-hold and constant-mix strategies to assess the suitability of deep reinforcement learning for dynamic asset allocation.
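The update rule behind the Q-Learning approach named in the abstract can be illustrated with a minimal tabular sketch. It is not the thesis's implementation: the thesis uses a neural network (Deep Q-Network) to approximate Q-values, whereas this sketch stores them in a plain list. The action grid `STOCK_WEIGHTS`, the use of the one-period portfolio return as the reward, and the hyperparameters `alpha` and `gamma` are all assumptions made for illustration.

```python
# Hypothetical illustration only: the action grid, reward definition, and
# hyperparameters below are assumptions, not values taken from the thesis.
STOCK_WEIGHTS = [0.0, 0.25, 0.5, 0.75, 1.0]  # assumed actions: fraction held in the equity ETF


def portfolio_return(stock_weight, stock_ret, bond_ret):
    """One-period return of a two-asset portfolio with the given equity weight."""
    return stock_weight * stock_ret + (1.0 - stock_weight) * bond_ret


def q_update(q_row, action, reward, next_q_row, alpha=0.1, gamma=0.99):
    """Apply one Q-learning update in place and return the updated Q-value.

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    target = reward + gamma * max(next_q_row)
    q_row[action] += alpha * (target - q_row[action])
    return q_row[action]


def greedy_weight(q_row):
    """Return the equity weight whose Q-value is currently the largest."""
    best = max(range(len(q_row)), key=lambda i: q_row[i])
    return STOCK_WEIGHTS[best]


# Usage: one update for a single state, starting from all-zero Q-values.
q_state = [0.0] * len(STOCK_WEIGHTS)
q_next_state = [0.0] * len(STOCK_WEIGHTS)
reward = portfolio_return(0.75, stock_ret=0.02, bond_ret=0.004)
q_update(q_state, action=3, reward=reward, next_q_row=q_next_state)
chosen = greedy_weight(q_state)
```

After this single update only the Q-value of the chosen action has moved away from zero, so the greedy choice is the 75% equity weight. A real agent would also need an exploration scheme and a state representation, neither of which is specified in this record.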
Description	Master's thesis; National Chengchi University; Department of Money and Banking; 104352029
Source	http://thesis.lib.nccu.edu.tw/record/#G0104352029
Type	thesis

dc.contributor.advisor	廖四郎	zh_TW
dc.contributor.author (Authors)	劉上瑋	zh_TW
dc.creator (Author)	劉上瑋	zh_TW
dc.date (Date)	2017	en_US
dc.date.accessioned	1-Nov-2017 14:21:03 (UTC+8)	-
dc.date.available	1-Nov-2017 14:21:03 (UTC+8)	-
dc.date.issued (Upload time)	1-Nov-2017 14:21:03 (UTC+8)	-
dc.identifier (Other Identifiers)	G0104352029	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/114285	-
dc.description (Description)	Master's	zh_TW
dc.description (Description)	National Chengchi University	zh_TW
dc.description (Description)	Department of Money and Banking	zh_TW
dc.description (Description)	104352029	zh_TW
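dc.description.tableofcontents	Chapter 1 Introduction 1; Section 1 Research Background and Motivation 1; Section 2 Research Objectives 2; Chapter 2 Literature Review 3; Section 1 Asset Allocation 3; Section 2 Reinforcement Learning 7; Chapter 3 Methodology 21; Section 1 Portfolio Construction 21; Section 2 Data Processing 27; Section 3 Reinforcement Learning System Design 27; Section 4 Deep Q-Network 29; Chapter 4 Results 33; Section 1 Analysis of Results 33; Chapter 5 Conclusions and Recommendations 34; References 36	zh_TW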
dc.format.extent	1660080 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0104352029	en_US
dc.subject (Keywords)	Dynamic asset allocation	en_US
dc.subject (Keywords)	Deep reinforcement learning	en_US
dc.subject (Keywords)	Q-Learning	en_US
dc.subject (Keywords)	Neural network	en_US
dc.title (Title)	深度增強學習在動態資產配置上之應用— 以美國ETF為例	zh_TW
dc.title (Title)	The Application of Deep Reinforcement Learning on Dynamic Asset Allocation: A Case Study of U.S. ETFs	en_US
dc.type (Type)	thesis	en_US