Title  Double DQN 模型應用於自動股票交易系統
Application of DDQN Model in Automated Stock Trading System
Author  柯元富 (Ko, Yuan-Fu)
Advisor  蔡炎龍 (Tsai, Yen-Lung)
Keywords  Deep Learning
Reinforcement Learning
Q-Learning
Automated Stock Trading System
Date  2022
Date uploaded  1-Feb-2023 13:51:24 (UTC+8)
Abstract  This thesis combines reinforcement learning and deep learning to build an automated stock trading system. In addition to raw market data, several technical indicators commonly used by investors are added as features; given the previous 10 days of data, the system is trained with a fully connected neural network and Q-learning.
Training was carried out on two groups. In the first group, all constituent stocks of the Taiwan 50 were used as training data and performance was tested over the following 2 years; in the second group, 9 electronic stocks from the Taiwan 50 were used as training data and performance was tested over the same 2-year horizon. The results show no significant difference between the first group and a buy-and-hold strategy, while the second group clearly outperforms buy-and-hold.
The experimental results demonstrate that the DQN model can be effective in an automated stock trading system under certain conditions.
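The record does not include the thesis code, but the abstract is concrete enough to sketch the core update it describes: a fully connected Q-network over a flattened 10-day window of market features, trained with the Double DQN rule of Van Hasselt et al. [10]. The sketch below is an illustration only; the framework (PyTorch), the per-day feature count, the three-action set (buy / hold / sell), the network width, and the hyperparameters are assumptions, not values taken from the thesis.

```python
# Minimal Double DQN sketch under the assumptions stated above.
import torch
import torch.nn as nn

N_FEATURES = 8    # assumed number of features per day (prices + technical indicators)
WINDOW = 10       # the 10-day lookback window mentioned in the abstract
N_ACTIONS = 3     # assumed action set: buy, hold, sell
GAMMA = 0.99      # assumed discount factor

class QNet(nn.Module):
    """Fully connected Q-network mapping a flattened 10-day window to action values."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(WINDOW * N_FEATURES, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, N_ACTIONS),
        )

    def forward(self, x):
        return self.net(x)

online, target = QNet(), QNet()
target.load_state_dict(online.state_dict())   # target network starts as a copy
opt = torch.optim.Adam(online.parameters(), lr=1e-3)

def double_dqn_step(state, action, reward, next_state, done):
    """One gradient step on the Double DQN target:
    y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    q = online(state).gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        best_next = online(next_state).argmax(dim=1, keepdim=True)   # action chosen by online net
        next_q = target(next_state).gather(1, best_next).squeeze(1)  # value from target net
        y = reward + GAMMA * next_q * (1.0 - done)
    loss = nn.functional.mse_loss(q, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Dummy batch showing the expected tensor shapes (random data, not market data).
batch = 32
s = torch.randn(batch, WINDOW * N_FEATURES)
a = torch.randint(0, N_ACTIONS, (batch,))
r = torch.randn(batch)
s2 = torch.randn(batch, WINDOW * N_FEATURES)
d = torch.zeros(batch)
print(double_dqn_step(s, a, r, s2, d))
```

The part that distinguishes Double DQN from plain DQN is inside `double_dqn_step`: the online network selects the next action and the target network evaluates it, which is the decoupling that reduces the Q-value overestimation plain DQN suffers from.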
References  [1] D. Z. Anderson, C. Benkert, and D. D. Crouch. Neural Networks for Perception.
[2] Huang Guan Chi. Double Q-network in automated stock trading. 2021.
[3] M. Lai. Playing Atari with deep reinforcement learning.
[4] Jae Won Lee, Jonghun Park, O Jangmin, Jongwoo Lee, and Euyseok Hong. A multiagent approach to Q-learning for daily stock trading. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans, 37(6):864–877, 2007.
[5] Moshe Leshno, Vladimir Ya Lin, Allan Pinkus, and Shimon Schocken. Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Networks, 6(6):861–867, 1993.
[6] Timothy P Lillicrap, Jonathan J Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971, 2015.
[7] Jerome H Saltzer, David P Reed, and David D Clark. End-to-end arguments in system design. ACM Transactions on Computer Systems (TOCS), 2(4):277–288, 1984.
[8] David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. Mastering the game of Go without human knowledge. Nature, 550(7676):354–359, 2017.
[9] Richard S Sutton, David McAllester, Satinder Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. Advances in Neural Information Processing Systems, 12, 1999.
[10] Hado Van Hasselt, Arthur Guez, and David Silver. Deep reinforcement learning with double Q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30, 2016.
[11] Bayya Yegnanarayana. Artificial Neural Networks. PHI Learning Pvt. Ltd., 2009.
[12] Özal Yıldırım, Paweł Pławiak, Ru-San Tan, and U Rajendra Acharya. Arrhythmia detection using deep convolutional neural network with long duration ECG signals. Computers in Biology and Medicine, 102:411–420, 2018.
Description  Master's thesis
National Chengchi University
Department of Applied Mathematics
109751009
Source  http://thesis.lib.nccu.edu.tw/record/#G0109751009
Type  thesis
URI  http://nccur.lib.nccu.edu.tw/handle/140.119/143172
Table of Contents  Acknowledgements ii
Chinese Abstract iii
Abstract iv
Contents v
List of Figures vii
1 Introduction 1
2 Related Work 2
3 Deep Learning 3
3.1 Deep Learning 3
3.2 Neurons and Neural Network 4
3.3 Activation Function 7
3.4 Loss Function 10
3.5 Gradient Descent 12
4 Reinforcement Learning 14
4.1 Reinforcement Learning Framework 14
4.2 Markov Decision Processes 16
4.3 Monte Carlo Method and Temporal Difference 18
5 Deep Reinforcement Learning 19
5.1 Q-Learning 19
5.2 Deep Q-Learning Network (DQN) 20
5.3 Tips for Training a Q-Learning Network 21
5.4 Policy Gradient 23
6 Automated Stock Trading System 25
6.1 Data Preparation 25
6.1.1 Two Data Sets 25
6.1.2 Features of Data 26
6.1.3 Normalization 28
6.2 Trading System Settlement 29
6.3 Initial Parameter Settlement 30
6.4 Neural Network 30
6.5 Result 30
7 Conclusion and Future Work 35
Appendix A 36
A.1 Appendix Content 36
Bibliography 37
Format  application/pdf, 1040318 bytes