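Asset Allocation Based on Ensemble-Learning Assisted Multi-Factor Models – Taiwan Stock Market as an Example (資產配置基於集成學習的多因子模型－以台灣股市為例)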


Title	資產配置基於集成學習的多因子模型－以台灣股市為例 (Asset Allocation Based on Ensemble-Learning Assisted Multi-Factor Models – Taiwan Stock Market as an Example)
Author	陳昱安 (Chen, Yu-An)
Contributors	江彌修 (Chiang, Mi-Hsiu), advisor; 陳昱安 (Chen, Yu-An), author
Keywords	Factor stock selection; Machine learning; Ensemble learning; XGBoost; Random forest; Taiwan stock market
Date	2020
Upload time	3-Aug-2020 17:38:21 (UTC+8)
Abstract	This study builds stock selection models from three traditional multi-factor models and from two ensemble-learning classifiers combined with factor-based selection: Extreme Gradient Boosting (XGBoost) and Random Forest. It compares the strategy performance of the traditional and machine-learning approaches, and examines how the two machine-learning models differ in predictive power and strategy performance. The asset universe is all stocks listed on the Taiwan stock market, with a backtest sample running from 2010/1/1 to 2020/1/1. The factor features set aside the fundamental data commonly used in earlier research in favor of price-volume (technical) data and the chip-side (籌碼面) trading data specific to the Taiwan market. Empirically, both machine-learning models substantially outperform the traditional multi-factor models over the in-sample backtest period and also outperform them out of sample, which indicates that these models can capture trends in asset returns. XGBoost in turn outperforms Random Forest out of sample. The thesis attributes this to XGBoost's penalty coefficients, which guard against overfitting the decision trees to historical data: because financial markets change over time, limiting overfitting improves out-of-sample performance.
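The record does not state the form of the penalty term that the abstract credits for XGBoost's out-of-sample advantage. As a rough sketch, assuming the thesis relies on the standard XGBoost objective, the regularized loss over $K$ trees can be written as below. The symbols ($l$, $f_k$, $T$, $w_j$, $\gamma$, $\lambda$) follow the usual XGBoost notation and are not taken from the thesis itself.

```latex
\begin{align}
\mathcal{L} &= \sum_{i=1}^{n} l\bigl(y_i, \hat{y}_i\bigr) + \sum_{k=1}^{K} \Omega(f_k), \\
\Omega(f) &= \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2}.
\end{align}
```

Here $l$ is the training loss on sample $i$, $f_k$ is the $k$-th tree, $T$ is the number of leaves in a tree, and $w_j$ is the weight of leaf $j$. Larger $\gamma$ or $\lambda$ penalizes trees with more leaves and more extreme leaf weights. A Random Forest has no such explicit penalty in its objective and instead relies on bagging and random feature subsets to reduce variance.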
Description	Master's thesis, National Chengchi University (國立政治大學), Department of Money and Banking (金融學系), 107352021
Source	http://thesis.lib.nccu.edu.tw/record/#G0107352021
Type	thesis

dc.contributor.advisor	江彌修	zh_TW
dc.contributor.advisor	Chiang, Mi-Hsiu	en_US
dc.contributor.author (Authors)	陳昱安	zh_TW
dc.contributor.author (Authors)	Chen, Yu-An	en_US
dc.creator (Author)	陳昱安	zh_TW
dc.creator (Author)	Chen, Yu-An	en_US
dc.date (Date)	2020	en_US
dc.date.accessioned	3-Aug-2020 17:38:21 (UTC+8)	-
dc.date.available	3-Aug-2020 17:38:21 (UTC+8)	-
dc.date.issued (Upload time)	3-Aug-2020 17:38:21 (UTC+8)	-
dc.identifier (Other Identifiers)	G0107352021	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/130991	-
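dc.description (Description)	碩士 (Master's)	zh_TW
dc.description (Description)	國立政治大學 (National Chengchi University)	zh_TW
dc.description (Description)	金融學系 (Department of Money and Banking)	zh_TW
dc.description (Description)	107352021	zh_TW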
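dc.description.tableofcontents	Chapter 1: Introduction; Chapter 2: Literature Review (Section 1: Research on factor stock selection; Section 2: Research on ensemble learning methods); Chapter 3: Methodology (Section 1: Research targets and data; Section 2: Supervised classification in machine learning; Section 3: Supervised classification concepts adopted in this study; Section 4: Model and factor stock selection backtest design; Section 5: Strategy performance metrics); Chapter 4: Empirical Results (Section 1: Traditional multi-factor stock selection strategies; Section 2: Machine-learning multi-factor stock selection strategies; Section 3: Summary of factor stock selection strategy performance); Chapter 5: Conclusions and Suggestions (Section 1: Conclusions; Section 2: Suggestions for future strategies); References	zh_TW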
dc.format.extent	2961745 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0107352021	en_US
dc.subject (Keywords)	因子選股 (factor stock selection)	zh_TW
dc.subject (Keywords)	機器學習 (machine learning)	zh_TW
dc.subject (Keywords)	集成學習 (ensemble learning)	zh_TW
dc.subject (Keywords)	隨機森林 (random forest)	zh_TW
dc.subject (Keywords)	台股市場 (Taiwan stock market)	zh_TW
dc.subject (Keywords)	Factors stock selection	en_US
dc.subject (Keywords)	Machine learning	en_US
dc.subject (Keywords)	Ensemble-learning	en_US
dc.subject (Keywords)	XGBoost	en_US
dc.subject (Keywords)	Random forest	en_US
dc.subject (Keywords)	Taiwan stock market	en_US
dc.title (Title)	資產配置基於集成學習的多因子模型－以台灣股市為例	zh_TW
dc.title (Title)	Asset Allocation Based on Ensemble-Learning Assisted Multi-Factor Models – Taiwan Stock Market as an Example	en_US
dc.type (Type)	thesis	en_US
dc.identifier.doi (DOI)	10.6814/NCCU202000702	en_US
