Title: 應用樹狀演算法預測高頻數據下的波動度-以台灣股票市場為例
Forecasting Volatility under High Frequency Data with Tree-based Algorithm: Evidence from the Taiwan Stock Market
Author: 范姜峻浩 (Fan Jiang, Jun-Hao)
Contributors: 岳夢蘭 (Yueh, Meng-Lan), advisor; 范姜峻浩 (Fan Jiang, Jun-Hao), author
Keywords: Realized Volatility; HAR-RV Model; Volatility Forecasting; Machine Learning; Decision Tree; Random Forest
Date: 2022
Uploaded: 1-Aug-2022 17:19:02 (UTC+8)
Abstract:
This thesis evaluates forecasts of realized volatility computed from 5-minute high-frequency data. Using the HAR-RV model (Corsi, 2009) and its extensions as the framework, we apply tree-based machine learning algorithms to predict the realized volatility of three indices in the Taiwan stock market, and we investigate whether capturing non-linear relationships between variables improves on the out-of-sample performance of the linear regression model. Beyond forecasting the level of realized volatility, we also consider the direction of volatility and propose a classification model that uses tree-based algorithms to predict whether volatility will rise or fall. Our results show that the random forest algorithm outperforms the other models in most cases: it attains lower RMSE (root mean square error) and MAE (mean absolute error) at every forecast horizon and, in most cases, lower MAPE (mean absolute percentage error). The Diebold-Mariano test also indicates that the random forest's forecast errors are significantly lower than those of the linear model, which suggests that fitting the model with non-linear tree-based algorithms does improve predictive performance. Furthermore, our models perform best when forecasting the volatility of the Taiwan Electronics Index, with lower MAPE than for the other two indices and higher accuracy and precision in direction prediction.
References:
[1] Andersen, T. G., & Bollerslev, T. (1998). Answering the Skeptics: Yes, Standard Volatility Models do Provide Accurate Forecasts. International Economic Review, 39(4), 885-905.
[2] Andersen, T. G., Bollerslev, T., Diebold, F. X., & Labys, P. (2003). Modeling and Forecasting Realized Volatility. Econometrica, 71, 579-625.
[3] Andersen, T. G., & Teräsvirta, T. (2009). Realized Volatility. In Mikosch, T., Kreiß, J.-P., Davis, R., & Andersen, T. (Eds.), Handbook of Financial Time Series. Springer, Berlin, Heidelberg.
[4] Andersen, T. G., Bollerslev, T., & Diebold, F. X. (2007). Roughing It Up: Including Jump Components in the Measurement, Modeling, and Forecasting of Return Volatility. The Review of Economics and Statistics, 89(4), 701-720.
[5] Barndorff-Nielsen, O. E., & Shephard, N. (2004). Power and Bipower Variation with Stochastic Volatility and Jumps. Journal of Financial Econometrics, 2(1), 1-37.
[6] Barndorff-Nielsen, O. E., Kinnebrock, S., & Shephard, N. (2008). Measuring Downside Risk - Realised Semivariance. CREATES Research Paper, No. 2008-42.
[7] Breiman, L. (2001). Random Forests. Machine Learning, 45, 5-32.
[8] Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification And Regression Trees (1st ed.). Routledge.
[9] Browne, M. W. (2000). Cross-Validation Methods. Journal of Mathematical Psychology, 44(1), 108-132.
[10] Bouri, E., Gkillas, K., Gupta, R., & Pierdzioch, C. (2021). Forecasting realized volatility of bitcoin: The role of the trade war. Computational Economics, 57(1), 29-53.
[11] Chicco, D., Warrens, M. J., & Jurman, G. (2021). The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Computer Science, 7, e623.
[12] Christensen, K., Siggaard, M., & Veliyev, B. (2021). A machine learning approach to volatility forecasting. Available at SSRN.
[13] Christoffersen, P. F., & Diebold, F. X. (2006). Financial asset returns, direction-of-change forecasting, and volatility dynamics. Management Science, 52(8), 1273-1287.
[14] Corsi, F. (2009). A simple approximate long-memory model of realized volatility. Journal of Financial Econometrics, 7(2), 174-196.
[15] Corsi, F., & Renò, R. (2012). Discrete-time volatility forecasting with persistent leverage effect and the link with continuous-time volatility modeling. Journal of Business & Economic Statistics, 30(3), 368-380.
[16] Degiannakis, S., & Filis, G. (2017). Forecasting oil price realized volatility using information channels from other asset classes. Journal of International Money and Finance, 76, 28-49.
[17] De Myttenaere, A., Golden, B., Le Grand, B., & Rossi, F. (2016). Mean absolute percentage error for regression models. Neurocomputing, 192, 38-48.
[18] Engle, R. F., & Patton, A. J. (2007). What good is a volatility model? In Knight, J., & Satchell, S. (Eds.), Forecasting Volatility in the Financial Markets (3rd ed.), Quantitative Finance, Butterworth-Heinemann, 47-63.
[19] Franses, P. H., & Van Dijk, D. (1996). Forecasting stock market volatility using (non-linear) Garch models. Journal of Forecasting, 15, 229-235.
[20] Huang, J., & Ling, C. X. (2005). Using AUC and accuracy in evaluating learning algorithms. IEEE Transactions on Knowledge and Data Engineering, 17(3), 299-310.
[21] Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: principles and practice. OTexts.
[22] James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112, p. 18). New York: Springer.
[23] Liu, L. Y., Patton, A. J., & Sheppard, K. (2015). Does anything beat 5-minute RV? A comparison of realized measures across multiple asset classes. Journal of Econometrics, 187(1), 293-311.
[24] Liu, L., & Zhang, T. (2015). Economic policy uncertainty and stock market volatility. Finance Research Letters, 15, 99-105.
[25] Luong, C., & Dokuchaev, N. (2018). Forecasting of realised volatility with the random forests algorithm. Journal of Risk and Financial Management, 11(4), 61.
[26] McAleer, M., & Medeiros, M. C. (2008). Realized volatility: A review. Econometric Reviews, 27(1-3), 10-45.
[27] Myles, A. J., Feudale, R. N., Liu, Y., Woody, N. A., & Brown, S. D. (2004). An introduction to decision tree modeling. Journal of Chemometrics, 18, 275-285.
[28] Patton, A. J., & Sheppard, K. (2015). Good Volatility, Bad Volatility: Signed Jumps and The Persistence of Volatility. The Review of Economics and Statistics, 97(3), 683-697.
[29] Peng, Y., Albuquerque, P. H. M., de Sá, J. M. C., Padula, A. J. A., & Montenegro, M. R. (2018). The best of two worlds: Forecasting high frequency volatility for cryptocurrencies and traditional currencies with Support Vector Regression. Expert Systems with Applications, 97, 177-192.
[30] Sagi, O., & Rokach, L. (2018). Ensemble learning: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery.
[31] Hastie, T., Friedman, J., & Tibshirani, R. (2001). The Elements of Statistical Learning: data mining, inference, and prediction. Springer New York, NY.
[32] Wen, F., Gong, X., & Cai, S. (2016). Forecasting the volatility of crude oil futures using HAR-type models with structural breaks. Energy Economics, 59, 400-413.
[33] Zhou, Y., Li, T., Shi, J., & Qian, Z. (2019). A CEEMDAN and XGBOOST-based approach to forecast crude oil prices. Complexity, 201
Description: Master's thesis
National Chengchi University (國立政治大學)
Department of Finance (財務管理學系)
109357020
Source: http://thesis.lib.nccu.edu.tw/record/#G0109357020
Type: thesis
Other identifiers: G0109357020
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/141022
DOI: 10.6814/NCCU202200887
Format: application/pdf, 4835989 bytes
Table of contents:
Chapter 1 Introduction 1
Chapter 2 Literature Review 5
Section 1 Realized Volatility Forecasting 5
Section 2 Applications of Machine Learning Methods 7
Chapter 3 Methodology 9
Section 1 Variable Definitions 9
Section 2 Model Definitions 13
Section 3 Tree-based Algorithms 17
Section 4 Model Training Methods 23
Section 5 Measures of Forecast Error 25
Chapter 4 Empirical Analysis 29
Section 1 Data Description and Descriptive Statistics 29
Section 2 Treatment of Outliers 34
Section 3 Empirical Results 35
Chapter 5 Conclusion 52
References 54
Appendix 57
