Academic Output - Theses
Title: Application of DDQN Model in an Automated Stock Trading System (Double DQN 模型應用於自動股票交易系統)
Author: Ko, Yuan-Fu (柯元富)
Advisor: Tsai, Yen-Lung (蔡炎龍)
Keywords: Deep Learning; Reinforcement Learning; Q-Learning; Automated Stock Trading System
Date: 2022
Uploaded: 1-Feb-2023 13:51:24 (UTC+8)

Abstract:
This thesis combines reinforcement learning with deep learning to build an automated stock trading system. In addition to raw stock market data, the inputs include several technical indicators commonly used by investors; given the previous 10 days of data, a fully connected neural network is trained with Q-learning. Training was split into two groups. The first group used all constituent stocks of the Taiwan 50 as training data and tested performance over the following 2 years; the second group used 9 electronic stocks from the Taiwan 50 as training data and tested performance over the same 2-year horizon. The results show no significant difference between the first group and a buy-and-hold strategy, while the second group clearly outperforms buy-and-hold. The experiments demonstrate that the DQN model can be effective for automated stock trading under certain conditions.
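For context on the Double DQN of the title: the variant of reference [10] changes only how the bootstrap target is computed relative to vanilla DQN. The standard formulation below follows van Hasselt et al. and is not quoted from the thesis itself; here θ denotes the online-network weights and θ⁻ the target-network weights.

    % Vanilla DQN target: the target network both selects and evaluates
    % the next action, which tends to overestimate Q-values.
    y_t^{\mathrm{DQN}} = r_t + \gamma \max_{a'} Q(s_{t+1}, a'; \theta^{-})

    % Double DQN target: the online network selects the action,
    % the target network evaluates it (van Hasselt et al. [10]).
    y_t^{\mathrm{DDQN}} = r_t + \gamma \, Q\big(s_{t+1}, \operatorname{arg\,max}_{a'} Q(s_{t+1}, a'; \theta);\; \theta^{-}\big)

Decoupling action selection (online network) from action evaluation (target network) reduces the overestimation bias introduced by the max operator.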
References:
[1] D. Z. Anderson, C. Benkert, and D. D. Crouch. Neural Networks for Perception.
[2] Huang Guan Chi. Double Q-network in automated stock trading. 2021.
[3] M. Lai. Playing Atari with deep reinforcement learning.
[4] Jae Won Lee, Jonghun Park, O Jangmin, Jongwoo Lee, and Euyseok Hong. A multiagent approach to Q-learning for daily stock trading. IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 37(6):864–877, 2007.
[5] Moshe Leshno, Vladimir Ya Lin, Allan Pinkus, and Shimon Schocken. Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Networks, 6(6):861–867, 1993.
[6] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971, 2015.
[7] Jerome H. Saltzer, David P. Reed, and David D. Clark. End-to-end arguments in system design. ACM Transactions on Computer Systems (TOCS), 2(4):277–288, 1984.
[8] David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. Mastering the game of Go without human knowledge. Nature, 550(7676):354–359, 2017.
[9] Richard S. Sutton, David McAllester, Satinder Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. Advances in Neural Information Processing Systems, 12, 1999.
[10] Hado van Hasselt, Arthur Guez, and David Silver. Deep reinforcement learning with double Q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30, 2016.
[11] Bayya Yegnanarayana. Artificial Neural Networks. PHI Learning Pvt. Ltd., 2009.
[12] Özal Yıldırım, Paweł Pławiak, Ru-San Tan, and U Rajendra Acharya. Arrhythmia detection using deep convolutional neural network with long duration ECG signals. Computers in Biology and Medicine, 102:411–420, 2018.
Description: Master's thesis
National Chengchi University, Department of Applied Mathematics
Student ID: 109751009
Source: http://thesis.lib.nccu.edu.tw/record/#G0109751009
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/143172
Type: thesis
Format: application/pdf (1040318 bytes)

Table of Contents:
Acknowledgements
Chinese Abstract
Abstract
Contents
List of Figures
1 Introduction
2 Related Work
3 Deep Learning
3.1 Deep Learning
3.2 Neurons and Neural Network
3.3 Activation Function
3.4 Loss Function
3.5 Gradient Descent
4 Reinforcement Learning
4.1 Reinforcement Learning Framework
4.2 Markov Decision Processes
4.3 Monte Carlo Method and Temporal Difference
5 Deep Reinforcement Learning
5.1 Q-Learning
5.2 Deep Q-Learning Network (DQN)
5.3 Tips for Training Q-Learning Network
5.4 Policy Gradient
6 Automated Stock Trading System
6.1 Data Preparation
6.1.1 Two Data Sets
6.1.2 Features of Data
6.1.3 Normalization
6.2 Trading System Settlement
6.3 Initial Parameter Settlement
6.4 Neural Network
6.5 Result
7 Conclusion and Future Work
Appendix A
A.1 Appendix Content
Bibliography
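Chapter 6 of the table of contents covers the trading system itself. As a rough illustration of how the pieces named in the abstract (a 10-day window of features, a fully connected network, Q-learning with a Double DQN target) might fit together, here is a minimal sketch; every layer size, hyperparameter, feature count, and the buy/hold/sell action set are illustrative assumptions, not the thesis's actual code.

    # A minimal sketch, assuming a setup like the thesis describes: the state
    # is the flattened previous 10 days of features, actions are buy/hold/sell,
    # and a fully connected network is trained with a Double DQN target.
    import random
    from collections import deque

    import torch
    import torch.nn as nn

    WINDOW = 10        # days of history per state (from the abstract)
    N_FEATURES = 8     # raw prices + technical indicators (assumed count)
    N_ACTIONS = 3      # buy / hold / sell (assumed action set)
    GAMMA = 0.99       # discount factor (assumed)

    def make_qnet() -> nn.Sequential:
        # Fully connected Q-network over the flattened 10-day window.
        return nn.Sequential(
            nn.Linear(WINDOW * N_FEATURES, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS),
        )

    online, target = make_qnet(), make_qnet()
    target.load_state_dict(online.state_dict())
    optimizer = torch.optim.Adam(online.parameters(), lr=1e-4)

    # Replay buffer of (state, action, reward, next_state, done) tensors, e.g.
    # replay.append((s, torch.tensor(a), torch.tensor(r), s2, torch.tensor(float(d))))
    replay = deque(maxlen=50_000)

    def train_step(batch_size: int = 64) -> None:
        if len(replay) < batch_size:
            return
        s, a, r, s2, done = map(torch.stack, zip(*random.sample(replay, batch_size)))
        with torch.no_grad():
            # Double DQN: the online network selects the next action,
            # the target network evaluates it (van Hasselt et al. [10]).
            next_a = online(s2).argmax(dim=1, keepdim=True)
            y = r + GAMMA * (1.0 - done) * target(s2).gather(1, next_a).squeeze(1)
        q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
        loss = nn.functional.mse_loss(q, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Periodically copying the online network's weights into the target network (for example, every few hundred steps) completes the loop; the thesis's own reward design and trading rules are described in Chapter 6.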