Title: 增強式學習建構臺灣股價指數期貨之交易策略 (Reinforcement Learning to Construct TAIFX Trading Strategies)
Author: Hong, Tzu-Hsuan (洪子軒)
Advisors: 林士貴; 蔡瑞煌
Keywords: Algorithmic trading; Taiwan stock index futures; Machine learning; Reinforcement learning; SARSA; Q-Learning; DQN
Date: 2018
Uploaded: 31-Jul-2018 13:45:49 (UTC+8)

Abstract:
Machine learning and artificial intelligence techniques can inform decision-making in financial trading and yield novel trading strategies; this study explores the potential of reinforcement learning in that role. Reinforcement learning learns and optimizes a strategy autonomously by having a learning agent (RL-agent) interact with its environment, and two of its characteristics, environment exploration and delayed reward, map naturally onto the problem of constructing trading strategies for financial markets. This study therefore applies reinforcement learning to build trading strategies for Taiwan stock index futures. In terms of research design, we examine three experimental setups and three algorithms: Q-learning, SARSA, and DQN. Using Taiwan stock index futures historical data from July 1, 2007 through December 31, 2017, we train the models and evaluate their performance over this period. The empirical results show that, under a well-specified experimental design, the learning agent can construct trading strategies that outperform the market and earn stable profits.
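For readers unfamiliar with the two tabular algorithms named in the abstract, the sketch below contrasts the Q-learning (off-policy) and SARSA (on-policy) temporal-difference updates in a toy long/flat/short futures setting. This is not the thesis's implementation: the record does not disclose the state features, action set, reward definition, or DQN architecture, so the discretized price-change state, the unit-position actions, and the mark-to-market reward used here are placeholder assumptions.

```python
# Minimal tabular Q-learning / SARSA sketch for a trading agent.
# All environment details (state, actions, reward) are illustrative
# assumptions, not the thesis's design; transaction costs are ignored.
import random
from collections import defaultdict

ACTIONS = (-1, 0, 1)            # short, flat, long (one futures contract)
ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1

def epsilon_greedy(q, state):
    """Explore with probability EPSILON, otherwise act greedily."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def q_learning_update(q, s, a, r, s_next):
    """Off-policy TD update: bootstrap from the best next action."""
    best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
    q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])

def sarsa_update(q, s, a, r, s_next, a_next):
    """On-policy TD update: bootstrap from the action actually taken."""
    q[(s, a)] += ALPHA * (r + GAMMA * q[(s_next, a_next)] - q[(s, a)])

def discretize(prev_price, price):
    """Toy state: sign of the last price change."""
    return (price > prev_price) - (price < prev_price)

def train(prices, update="q"):
    q = defaultdict(float)
    s = discretize(prices[0], prices[1])
    a = epsilon_greedy(q, s)
    for t in range(1, len(prices) - 1):
        r = a * (prices[t + 1] - prices[t])   # mark-to-market P&L of held position
        s_next = discretize(prices[t], prices[t + 1])
        a_next = epsilon_greedy(q, s_next)
        if update == "q":
            q_learning_update(q, s, a, r, s_next)
        else:
            sarsa_update(q, s, a, r, s_next, a_next)
        s, a = s_next, a_next
    return q

if __name__ == "__main__":
    random.seed(0)
    # Synthetic prices, not TAIFEX data.
    fake_prices = [10000 + random.gauss(0, 50) for _ in range(5000)]
    print(dict(train(fake_prices, update="q")))
```

The only substantive difference between the two updates is the bootstrap target: Q-learning backs up the greedy value max over next actions, while SARSA backs up the value of the action the epsilon-greedy policy actually selects, which typically makes SARSA more conservative while exploration is active.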
References:
[1] Bekiros, S. D. (2010). Heterogeneous trading strategies with adaptive fuzzy actor-critic reinforcement learning: A behavioral approach. Journal of Economic Dynamics & Control, 34(6), 1153-1170.
[2] Bellman, R. E. (1957). Dynamic Programming. Princeton University Press, Princeton, NJ. Republished 2003.
[3] Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. The Journal of Finance, 25(2), 383-417.
[4] Gold, C. (2003). FX trading via recurrent reinforcement learning. Proceedings of the IEEE International Conference on Computational Intelligence for Financial Engineering, 363-370.
[5] Irwin, S. H. and Park, C. H. (2007). What do we know about the profitability of technical analysis? Journal of Economic Surveys, 21(4), 786-826.
[6] Kearns, M. and Nevmyvaka, Y. (2013). Machine learning for market microstructure and high frequency trading. In: Easley, D., López de Prado, M., O'Hara, M. (Eds.), High-Frequency Trading: New Realities for Traders, Markets and Regulators, 91-124.
[7] Lu, T. H. and Chen, Y. C. (2015). Trend definition or holding strategy: What determines the profitability of candlestick charting? Journal of Banking & Finance, 61, 172-183.
[8] Moody, J. and Saffell, M. (2001). Learning to trade via direct reinforcement. IEEE Transactions on Neural Networks, 12(4), 875-889.
[9] Moody, J., Wu, L., Liao, Y., and Saffell, M. (1998). Performance functions and reinforcement learning for trading systems and portfolios. Journal of Forecasting, 17(5-6), 441-470.
[10] Moody, J. and Wu, L. (1997). Optimization of trading systems and portfolios. In: Abu-Mostafa, Y., Refenes, A. N., and Weigend, A. S. (Eds.), Decision Technologies for Financial Engineering, World Scientific, London, 23-35.
[11] O, J., Lee, J., Lee, J. W., and Zhang, B.-T. (2006). Adaptive stock trading with dynamic asset allocation using reinforcement learning. Information Sciences, 176(15), 2121-2147.
[12] Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.
[13] Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
[14] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533.

Description: Master's thesis
Institution: 國立政治大學 (National Chengchi University)
Department: 金融學系 (Department of Money and Banking)
Student ID: 105352020
Source: http://thesis.lib.nccu.edu.tw/record/#G0105352020
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/119091
Type: thesis
Format: application/pdf (2,720,883 bytes)

Table of Contents:
Chapter 1  Introduction
  1.1 Research Motivation
  1.2 Research Objectives
  1.3 Thesis Structure
Chapter 2  Literature Review
  2.1 Efficient Market Hypothesis
  2.2 Technical Analysis
  2.3 Reinforcement Learning Literature
Chapter 3  Methodology
  3.1 Reinforcement Learning
  3.2 Temporal-Difference Methods
  3.3 Reinforcement Learning Model Architecture
  3.4 Research Target and Data Selection
  3.5 Experimental Design
Chapter 4  Experimental Results and Analysis
  4.1 Model Performance Overview (Models 1 through 9)
  4.2 Comparative Analysis of Model Performance
Chapter 5  Conclusion and Future Work
  5.1 Conclusion
  5.2 Future Outlook
References
DOI: 10.6814/THE.NCCU.MB.023.2018.F06