Title The Application of Deep Reinforcement Learning on Dynamic Asset Allocation: A Case Study of U.S. ETFs
Original title 深度增強學習在動態資產配置上之應用—以美國ETF為例
Author 劉上瑋
Contributors 廖四郎
劉上瑋
Keywords Dynamic asset allocation
Deep reinforcement learning
Q-Learning
Neural network
Date 2017
Upload time 1-Nov-2017 14:21:03 (UTC+8)
Abstract Reinforcement learning learns through continual interaction with an environment, with the goal of maximizing the sum of the rewards received in each period, and it is widely used for multi-period decision problems. These properties make it a natural fit for dynamic asset allocation strategies, which must keep adjusting portfolio weights over time.
This study applies the Deep Q-Learning algorithm to build a dynamic asset allocation strategy and examines how to find the optimal allocation weights under the different environment states of each period. The portfolio consists of U.S. large- and mid-cap equity ETFs and investment-grade bond ETFs, using data from July 2, 2007 to June 30, 2017. The model is trained on daily returns, and its performance is compared with buy-and-hold and constant-mix strategies to assess the suitability of deep reinforcement learning for dynamic asset allocation.
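The Q-learning update that underlies the approach described in the abstract can be sketched in a few lines. This is a simplified tabular version, not the thesis's deep Q-network: the three-point stock-weight grid, the binary "previous stock return was positive" state, the use of the one-period portfolio return as the reward, and the hyperparameters `alpha`, `gamma`, and `epsilon` are all illustrative assumptions, not details taken from the thesis.

```python
import random

# Hypothetical action grid: each action is the stock weight; the remainder goes to bonds.
STOCK_WEIGHTS = [0.0, 0.5, 1.0]


def portfolio_return(stock_weight, stock_ret, bond_ret):
    """One-period return of a two-asset stock/bond portfolio."""
    return stock_weight * stock_ret + (1.0 - stock_weight) * bond_ret


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)) in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def train(returns, episodes=200, epsilon=0.1, seed=0):
    """Learn a Q-table from a list of (stock_ret, bond_ret) pairs.

    Assumed state: 1 if the previous period's stock return was positive, else 0.
    Assumed reward: the portfolio return of the current period.
    """
    rng = random.Random(seed)
    q = [[0.0] * len(STOCK_WEIGHTS) for _ in range(2)]
    for _ in range(episodes):
        for t in range(1, len(returns)):
            state = int(returns[t - 1][0] > 0)
            if rng.random() < epsilon:  # explore a random weight
                action = rng.randrange(len(STOCK_WEIGHTS))
            else:  # exploit the current estimate
                action = max(range(len(STOCK_WEIGHTS)), key=lambda a: q[state][a])
            reward = portfolio_return(STOCK_WEIGHTS[action], *returns[t])
            next_state = int(returns[t][0] > 0)
            q_update(q, state, action, reward, next_state)
    return q


def greedy_weight(q, state):
    """Stock weight with the highest estimated value in the given state."""
    best = max(range(len(STOCK_WEIGHTS)), key=lambda a: q[state][a])
    return STOCK_WEIGHTS[best]
```

Calling `train(daily_returns)` on a list of `(stock_ret, bond_ret)` pairs and then `greedy_weight(q, state)` gives the weight the learned table prefers in each state. A deep Q-network replaces the table `q` with a neural network that maps a richer state to one value per action, while the update target keeps the same form.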
References [1] Arnott, R. D., Hsu, J., & Moore, P. (2005). Fundamental indexation. Financial Analysts Journal, 61(2), 83-99.

[2] Bellman, R. E. (1957). Dynamic Programming. Princeton, NJ: Princeton University Press. Republished 2003: Dover.

[3] Daryanani, G. (2008). Opportunistic rebalancing: A new paradigm for wealth managers. Journal of Financial Planning, 21(1).

[4] DeMiguel, V., Garlappi, L., & Uppal, R. (2009). Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy? The Review of Financial Studies, 22(5), 1915-1953.

[5] Brinson, G. P., Singer, B. D., & Beebower, G. L. (1991). Determinants of portfolio performance II: An update. Financial Analysts Journal, 47(3), 40-48.

[6] Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237-285.

[7] Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR).

[8] Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., & Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971.

[9] Markowitz, H. (1952). Portfolio selection. The Journal of Finance, 7(1), 77-91.

[10] Michaud, R. O. (1998). Efficient Asset Management: A Practical Guide to Stock Portfolio Management and Asset Allocation. Financial Management Association, Survey and Synthesis Series. Boston, MA: HBS Press.

[11] Michaud, R. O. (1989). The Markowitz optimization enigma: Is 'optimized' optimal? Financial Analysts Journal, 45(1), 31-42.

[12] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533.

[13] Perold, A. F., & Sharpe, W. F. (1988). Dynamic strategies for asset allocation. Financial Analysts Journal, 16-27.

[14] Plaxco, L. M., & Arnott, R. D. (2002). Rebalancing a global policy benchmark. The Journal of Portfolio Management, 28(2), 9-22.

[15] Sharpe, W. F. (1966). Mutual fund performance. The Journal of Business, 39(1), 119-138.

[16] Sharpe, W. F. (1987). Integrated asset allocation. Financial Analysts Journal, 43(5), 25-32.

[17] Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.

[18] Watkins, C. J., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3-4), 279-292.
Description Master's degree
National Chengchi University
Department of Money and Banking
104352029
Source http://thesis.lib.nccu.edu.tw/record/#G0104352029
Type thesis
dc.contributor.advisor 廖四郎 [zh_TW]
dc.contributor.author (Authors) 劉上瑋 [zh_TW]
dc.creator (Author) 劉上瑋 [zh_TW]
dc.date (Date) 2017 [en_US]
dc.date.accessioned 1-Nov-2017 14:21:03 (UTC+8)
dc.date.available 1-Nov-2017 14:21:03 (UTC+8)
dc.date.issued (Upload time) 1-Nov-2017 14:21:03 (UTC+8)
dc.identifier (Other Identifiers) G0104352029 [en_US]
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/114285
dc.description (Description) Master's degree [zh_TW]
dc.description (Description) National Chengchi University [zh_TW]
dc.description (Description) Department of Money and Banking [zh_TW]
dc.description (Description) 104352029 [zh_TW]
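dc.description.tableofcontents Chapter 1 Introduction
Section 1 Research Background and Motivation
Section 2 Research Objectives
Chapter 2 Literature Review
Section 1 Asset Allocation
Section 2 Reinforcement Learning
Chapter 3 Methodology
Section 1 Portfolio Construction
Section 2 Data Processing
Section 3 Reinforcement Learning System Design
Section 4 Deep Q-Network
Chapter 4 Results
Section 1 Analysis of Results
Chapter 5 Conclusions and Recommendations
References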
dc.format.extent 1660080 bytes
dc.format.mimetype application/pdf
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0104352029 [en_US]
dc.subject (Keywords) Dynamic asset allocation [en_US]
dc.subject (Keywords) Deep reinforcement learning [en_US]
dc.subject (Keywords) Q-Learning [en_US]
dc.subject (Keywords) Neural network [en_US]
dc.title (Title) 深度增強學習在動態資產配置上之應用—以美國ETF為例 [zh_TW]
dc.title (Title) The Application of Deep Reinforcement Learning on Dynamic Asset Allocation: A Case Study of U.S. ETFs [en_US]
dc.type (Type) thesis [en_US]