
Title: A Multi-input LSTM Model for Stock Price Prediction Using Multi-head Self-Attention (使用多頭注意力機制的多重輸入 LSTM 股價預測模型)
Author: Yeh, Kuo-Yi (葉國毅)
Advisor: Peng, Yan-Tsung (彭彥璁)
Keywords: Stock Price Prediction; Deep Learning; Long Short-Term Memory (LSTM); Attention Mechanism
Date: 2024
Uploaded: 1-Mar-2024 14:11:40 (UTC+8)
Abstract: With the successful application of machine learning across many fields, its use in financial markets has flourished, and stock price prediction has become a prominent research topic. Machine learning techniques allow predictive models to be built from large amounts of market data, technical indicators, and other diverse sources. Such models can identify and analyze price trends and patterns while also accounting for related factors such as financial data, market sentiment, and news events; this comprehensive view lets them capture market changes more accurately and provide investors with valuable insights for better-informed trading decisions.

This study uses the constituent stocks of the Taiwan 50 Index as training data. The Pearson correlation coefficient is used to identify the securities and benchmark indexes most correlated with the target stock's same-day price, and these serve as multiple input features. An attention mechanism is combined with an LSTM to build the stock price prediction model, and the model's predictions are used as trading signals for a performance backtest. The results show that the multiple inputs and the attention mechanism yield better predictions than the baseline models; in the backtest, the strategy avoids risk better than a hold-to-maturity (buy-and-hold) approach and earns stable positive returns.
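The feature-selection step described in the abstract — ranking securities by Pearson correlation with the target and taking the most correlated ones as extra inputs — could be sketched as follows. The function name, the use of log returns, and the `top_k` cutoff are illustrative assumptions, not the thesis's actual code.

```python
import numpy as np

def select_correlated_features(prices, target_idx, top_k=5):
    """Rank securities by absolute Pearson correlation of their daily log
    returns with the target security's returns, and return the indices of
    the top_k most correlated ones. (Illustrative sketch only.)

    prices: array of shape (num_days, num_securities), closing prices.
    """
    returns = np.diff(np.log(prices), axis=0)        # daily log returns
    target = returns[:, target_idx]
    corr = np.array([np.corrcoef(returns[:, j], target)[0, 1]
                     for j in range(returns.shape[1])])
    corr[target_idx] = 0.0                           # exclude the target itself
    return np.argsort(-np.abs(corr))[:top_k]         # most correlated first
```

The selected columns, together with the target's own history, would then form the multi-input feature set fed to the model.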
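A minimal sketch of the kind of model the abstract describes — a shared LSTM encoding each input price series, multi-head self-attention across the per-security encodings, and a linear head predicting the target's next price — could look like this in PyTorch. All layer sizes and the exact wiring are assumptions for illustration, not the thesis's architecture.

```python
import torch
import torch.nn as nn

class MultiInputAttnLSTM(nn.Module):
    """Multi-input LSTM with multi-head self-attention (illustrative sketch):
    each security's window is encoded by a shared LSTM, the per-security
    encodings attend to one another, and a linear head emits the prediction
    for the target security (assumed to sit in input slot 0)."""

    def __init__(self, n_features=1, hidden=32, n_heads=4):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, n_heads, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        # x: (batch, n_securities, window, n_features)
        b, s, w, f = x.shape
        _, (h, _) = self.lstm(x.reshape(b * s, w, f))  # encode each series
        enc = h[-1].reshape(b, s, -1)                  # (batch, n_securities, hidden)
        ctx, _ = self.attn(enc, enc, enc)              # attention across inputs
        return self.head(ctx[:, 0])                    # predict from target slot
```

With 6 input securities and 20-day windows, `model(torch.randn(2, 6, 20, 1))` yields one prediction per batch item.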
Description: Master's thesis, National Chengchi University, In-service Master's Program, Department of Computer Science; student ID 108971025
Source: http://thesis.lib.nccu.edu.tw/record/#G0108971025
Type: thesis
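The backtest comparison against buy-and-hold could be sketched as a toy long/flat rule: hold the stock only on days where the model predicts a rise, stay in cash otherwise. The signal rule and function below are hypothetical simplifications (no transaction costs or slippage), not the thesis's backtesting procedure.

```python
import numpy as np

def backtest_signals(prices, preds):
    """Toy long/flat backtest (illustrative sketch): go long for day t+1
    when the model's prediction made at day t exceeds day t's close.
    Returns (strategy_return, buy_and_hold_return) as total returns.

    prices: 1-D array of daily closes; preds[t] is the predicted
    price for day t+1, produced at day t.
    """
    rets = np.diff(prices) / prices[:-1]     # daily simple returns
    long_days = preds[:-1] > prices[:-1]     # signal: predicted rise
    strat = np.prod(1 + rets[long_days]) - 1 # compound only the held days
    hold = prices[-1] / prices[0] - 1        # buy-and-hold benchmark
    return strat, hold
```

Under this rule, correctly sitting out down days is what lets the strategy beat buy-and-hold, matching the abstract's claim of better risk avoidance.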
Identifier: G0108971025
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/150259
Table of Contents:
Chapter 1: Introduction — 1.1 Research Background; 1.2 Motivation and Objectives
Chapter 2: Literature Review — 2.1 Stock Market Research; 2.2 Research Methods
Chapter 3: Methodology — 3.1 Dataset; 3.2 Data Processing; 3.3 Model Architecture
Chapter 4: Experimental Results — 4.1 Evaluation Metrics; 4.2 Experimental Results; 4.3 Price Prediction; 4.4 Trading Performance; 4.5 Limitations and Future Work
Chapter 5: Conclusion
References
Format: application/pdf, 3,121,390 bytes