
Title: A Multi-input LSTM Model for Stock Price Prediction Using Multi-head Self-Attention (使用多頭注意力機制的多重輸入 LSTM 股價預測模型)
Author: Yeh, Kuo-Yi (葉國毅)
Advisor: Peng, Yan-Tsung (彭彥璁)
Keywords: Stock Price Prediction; Deep Learning; Long Short-Term Memory (LSTM); Attention Mechanism
Date: 2024
Uploaded: 1-Mar-2024 14:11:40 (UTC+8)
Abstract: With the successful application of machine learning across many fields, its use in financial markets has flourished, and stock price prediction has become a prominent research topic. Machine learning techniques allow predictive models to be built from large amounts of market data, technical indicators, and other diverse sources. Such models can identify and analyze price trends and patterns while also accounting for related factors such as financial data, market sentiment, and news events; this comprehensive view lets them capture market changes more accurately and provide investors with valuable insights for better-informed trading decisions.

This study uses the constituent stocks of the Taiwan 50 Index as training data. The Pearson correlation coefficient is used to identify the securities and benchmark indexes most correlated with the target stock's same-day price, and these serve as multiple input features. An attention mechanism is combined with an LSTM to build the stock price prediction model, and the model's predictions are used as trading signals for a performance backtest. The results show that the multiple inputs and the attention mechanism yield better predictions than the baseline models; in the backtest, the strategy avoids risk better than a hold-to-maturity (buy-and-hold) approach and earns stable positive returns.
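The feature-selection step described in the abstract — ranking securities by Pearson correlation with the target and taking the most correlated ones as extra inputs — could be sketched as follows. The function name, the use of log returns, and the `top_k` cutoff are illustrative assumptions, not the thesis's actual code.

```python
import numpy as np

def select_correlated_features(prices, target_idx, top_k=5):
    """Rank securities by absolute Pearson correlation of their daily log
    returns with the target security's returns, and return the indices of
    the top_k most correlated ones. (Illustrative sketch only.)

    prices: array of shape (num_days, num_securities), closing prices.
    """
    returns = np.diff(np.log(prices), axis=0)        # daily log returns
    target = returns[:, target_idx]
    corr = np.array([np.corrcoef(returns[:, j], target)[0, 1]
                     for j in range(returns.shape[1])])
    corr[target_idx] = 0.0                           # exclude the target itself
    return np.argsort(-np.abs(corr))[:top_k]         # most correlated first
```

The selected columns, together with the target's own history, would then form the multi-input feature set fed to the model.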
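A minimal sketch of the kind of model the abstract describes — a shared LSTM encoding each input price series, multi-head self-attention across the per-security encodings, and a linear head predicting the target's next price — could look like this in PyTorch. All layer sizes and the exact wiring are assumptions for illustration, not the thesis's architecture.

```python
import torch
import torch.nn as nn

class MultiInputAttnLSTM(nn.Module):
    """Multi-input LSTM with multi-head self-attention (illustrative sketch):
    each security's window is encoded by a shared LSTM, the per-security
    encodings attend to one another, and a linear head emits the prediction
    for the target security (assumed to sit in input slot 0)."""

    def __init__(self, n_features=1, hidden=32, n_heads=4):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, n_heads, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        # x: (batch, n_securities, window, n_features)
        b, s, w, f = x.shape
        _, (h, _) = self.lstm(x.reshape(b * s, w, f))  # encode each series
        enc = h[-1].reshape(b, s, -1)                  # (batch, n_securities, hidden)
        ctx, _ = self.attn(enc, enc, enc)              # attention across inputs
        return self.head(ctx[:, 0])                    # predict from target slot
```

With 6 input securities and 20-day windows, `model(torch.randn(2, 6, 20, 1))` yields one prediction per batch item.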
Description: Master's thesis, National Chengchi University, In-service Master's Program, Department of Computer Science; student ID 108971025
Source: http://thesis.lib.nccu.edu.tw/record/#G0108971025
Type: thesis
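The backtest comparison against buy-and-hold could be sketched as a toy long/flat rule: hold the stock only on days where the model predicts a rise, stay in cash otherwise. The signal rule and function below are hypothetical simplifications (no transaction costs or slippage), not the thesis's backtesting procedure.

```python
import numpy as np

def backtest_signals(prices, preds):
    """Toy long/flat backtest (illustrative sketch): go long for day t+1
    when the model's prediction made at day t exceeds day t's close.
    Returns (strategy_return, buy_and_hold_return) as total returns.

    prices: 1-D array of daily closes; preds[t] is the predicted
    price for day t+1, produced at day t.
    """
    rets = np.diff(prices) / prices[:-1]     # daily simple returns
    long_days = preds[:-1] > prices[:-1]     # signal: predicted rise
    strat = np.prod(1 + rets[long_days]) - 1 # compound only the held days
    hold = prices[-1] / prices[0] - 1        # buy-and-hold benchmark
    return strat, hold
```

Under this rule, correctly sitting out down days is what lets the strategy beat buy-and-hold, matching the abstract's claim of better risk avoidance.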
Identifier: G0108971025
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/150259
Table of Contents:
Chapter 1: Introduction — 1.1 Research Background; 1.2 Motivation and Objectives
Chapter 2: Literature Review — 2.1 Stock Market Research; 2.2 Research Methods
Chapter 3: Methodology — 3.1 Dataset; 3.2 Data Processing; 3.3 Model Architecture
Chapter 4: Experimental Results — 4.1 Evaluation Metrics; 4.2 Experimental Results; 4.3 Price Prediction; 4.4 Trading Performance; 4.5 Limitations and Future Work
Chapter 5: Conclusion
References
Format: application/pdf, 3,121,390 bytes