Title 多因子組合方法於台灣市場之實證研究:傳統動能、降維與機器學習的綜合評估
An Empirical Study of Multi-Factor Combination Methods in the Taiwan Market: A Comprehensive Evaluation of Traditional Momentum, Dimensionality Reduction, and Machine Learning Approaches
Author 陳昇華
Chen, Sheng-Hua
Contributors 林士貴
Lin, Shih-Kuei
陳昇華
Chen, Sheng-Hua
Keywords Multi-Factor
Momentum
Dimensionality Reduction
Machine Learning
Factor Combination
Trading Strategy
Date 2025
Uploaded 4-Aug-2025 14:33:19 (UTC+8)
Abstract This study examines cross-sectional return predictability in the Taiwan equity market from 2010 to 2025. We analyze a curated library of 52 firm-level predictors spanning valuation, growth, profitability, quality, technical, and liquidity dimensions. Daily data from the Taiwan Economic Journal (TEJ) are processed through a three-step pipeline of Median Absolute Deviation (MAD) clipping, size and industry neutralization, and Z-score standardization to ensure signal comparability and robustness. Single-factor tests show that value and quality variables deliver the highest Information Coefficients (IC) and Information Ratios (IR), whereas momentum, size, and risk-oriented factors exhibit more volatile performance. To synthesize information across predictors, we compare four classes of aggregation techniques: (i) equal-weight combinations, (ii) Principal Component Analysis (PCA), (iii) cross-sectional and time-series factor momentum (CSFM / TSFM), and (iv) three gradient-boosting rankers: CatBoost, XGBoost, and LightGBM. Among these, LightGBM attains the strongest out-of-sample results, recording an annualized return of 15.90%, a Sharpe ratio of 2.50, and a maximum drawdown of only −5.90% on the whole-market sample. Robustness tests on the more volatile and less liquid OTC segment confirm the superiority of machine-learning models: LightGBM still achieves a 21.29% annualized return, a Sharpe ratio of 5.13, and a Calmar ratio of 6.45, comfortably outperforming traditional momentum and PCA benchmarks. These findings underscore the adaptability of ensemble-tree models in emerging markets and highlight their capacity to capture nonlinear factor interactions that conventional linear or momentum frameworks may overlook.
Our contributions are three‑fold: (i) we provide the first comprehensive transferability test of U.S.‑validated predictors to Taiwan, (ii) we propose a rigorous preprocessing and model‑comparison protocol that mitigates extreme values and unintended style exposures, and (iii) we furnish market‑wide and segment‑specific evidence that LightGBM currently offers the most effective route to multi‑factor stock selection in Taiwan. Future research can extend this framework by incorporating dynamic transaction‑cost models, leverage constraints, and deep‑learning‑based factor integrators to further bridge the gap between academic insight and investable practice.
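The three-step preprocessing pipeline and the Information Coefficient named in the abstract can be illustrated with a minimal sketch for a single cross-section. This is not the thesis's code: the function names, the 3-MAD clipping threshold, the 1.4826 MAD scaling constant, the use of log market capitalization as the size variable, the regression-residual form of neutralization, and the rank (Spearman) form of the IC are all assumptions made for illustration, and the toy numbers are invented.

```python
import math
from collections import defaultdict
from statistics import mean, median, pstdev


def mad_clip(values, n_mad=3.0):
    """Clip values to median +/- n_mad * scaled MAD (threshold is an assumption)."""
    med = median(values)
    mad = 1.4826 * median(abs(v - med) for v in values)
    lo, hi = med - n_mad * mad, med + n_mad * mad
    return [min(max(v, lo), hi) for v in values]


def demean_by_group(values, groups):
    """Subtract each group's mean from its members."""
    buckets = defaultdict(list)
    for v, g in zip(values, groups):
        buckets[g].append(v)
    means = {g: mean(vs) for g, vs in buckets.items()}
    return [v - means[g] for v, g in zip(values, groups)]


def neutralize(values, log_size, industry):
    """Residual of an OLS of the factor on industry dummies and log size.

    By the Frisch-Waugh theorem this equals: demean both series within
    industry, then remove the slope on the demeaned size variable.
    """
    y = demean_by_group(values, industry)
    x = demean_by_group(log_size, industry)
    sxx = sum(a * a for a in x)
    beta = sum(a * b for a, b in zip(x, y)) / sxx if sxx > 0 else 0.0
    return [b - beta * a for a, b in zip(x, y)]


def zscore(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def preprocess(raw, market_cap, industry, n_mad=3.0):
    """MAD clipping, then size/industry neutralization, then Z-scoring."""
    clipped = mad_clip(raw, n_mad)
    resid = neutralize(clipped, [math.log(m) for m in market_cap], industry)
    return zscore(resid)


def ranks(values):
    """1-based ranks, with ties given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def rank_ic(signal, next_returns):
    """Spearman correlation between a signal and next-period returns."""
    rs, rr = ranks(signal), ranks(next_returns)
    ms, mr = mean(rs), mean(rr)
    cov = sum((a - ms) * (b - mr) for a, b in zip(rs, rr))
    norm = math.sqrt(sum((a - ms) ** 2 for a in rs) * sum((b - mr) ** 2 for b in rr))
    return cov / norm if norm > 0 else 0.0


# Toy cross-section (invented numbers, arbitrary units).
raw = [0.8, 1.2, 0.5, 9.0, 1.1, 0.7, 1.5, 0.9]
market_cap = [50, 120, 30, 400, 80, 60, 200, 45]
industry = ["semi", "semi", "semi", "semi", "fin", "fin", "fin", "fin"]
next_returns = [0.01, 0.03, -0.02, 0.00, 0.02, -0.01, 0.04, 0.00]

signal = preprocess(raw, market_cap, industry)
print([round(s, 3) for s in signal])
print(round(rank_ic(signal, next_returns), 3))
```

After these steps the signal has zero mean within each industry and zero sample correlation with log size, which is the sense in which unintended style exposures are removed. In a back-test the same steps would be repeated on every rebalancing date, and the IR is commonly taken as the mean of the per-date IC series divided by its standard deviation.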
References
[1] Dhingra, V., Sharma, A., & Gupta, S. K. (2023). Sectoral portfolio optimization by judicious selection of financial ratios via PCA. Optimization and Engineering, 25(3), 1431–1468.
[2] Engelberg, J., McLean, R. D., Pontiff, J., & Ringgenberg, M. C. (2023). Do cross-sectional predictors contain systematic information? Journal of Financial and Quantitative Analysis, 58(3), 1172–1201.
[3] Gu, S., Kelly, B. T., & Xiu, D. (2020). Empirical asset pricing via machine learning. The Review of Financial Studies, 33(5), 2223–2270.
[4] Gupta, T., & Kelly, B. T. (2018). Factor momentum everywhere (SSRN Scholarly Paper No. 3300728). Social Science Research Network.
[5] Scruggs, J. T. (2021, August). Does neutralizing style factors help or hurt? The Journal of Investing, 30(3).
[6] Wei, X., Tian, Y., Li, N., & Peng, H. (2024). Evaluating ensemble learning techniques for stock index trend prediction: A case of China. Portuguese Economic Journal, 23(3), 505–530.
Description Master's
National Chengchi University
Department of Money and Banking
112352034
Source http://thesis.lib.nccu.edu.tw/record/#G0112352034
Type thesis
dc.contributor.advisor 林士貴 (zh_TW)
dc.contributor.advisor Lin, Shih-Kuei (en_US)
dc.date.accessioned 4-Aug-2025 14:33:19 (UTC+8)
dc.date.available 4-Aug-2025 14:33:19 (UTC+8)
dc.date.issued (Uploaded) 4-Aug-2025 14:33:19 (UTC+8)
dc.identifier (Other Identifiers) G0112352034 (en_US)
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/158594
dc.description.tableofcontents
Contents
List of Figures
List of Tables
1 Introduction
2 Literature Review
  2.1 Factors
  2.2 Factor Preprocessing
  2.3 Methods of Factor Combination
3 Methodology
  3.1 Time-series Factor Momentum (TSFM)
  3.2 Cross-Sectional Factor Momentum (CSFM)
  3.3 Principal Components Analysis
  3.4 Ensemble Machine Learning Models
    3.4.1 CatBoost Ranking Model
    3.4.2 XGBoost Ranking Model
    3.4.3 LightGBM Ranking Model
  3.5 Machine Learning Evaluation Metrics
  3.6 Experimental Design
    3.6.1 Factor Preprocessing
    3.6.2 Back-testing Assumptions
4 Empirical Results
  4.1 Data and Factor Tables
  4.2 Single Factor Analysis
  4.3 Composite Factor Analysis
    4.3.1 Equal-Weight Composite (Benchmark)
    4.3.2 TSFM (1, 12)
    4.3.3 CSFM (1, 12)
    4.3.4 TSFM (1, 1)
    4.3.5 CSFM (1, 1)
    4.3.6 PCA Composite
    4.3.7 CatBoost Composite
    4.3.8 XGBoost Composite
    4.3.9 LightGBM Composite
  4.4 Back-testing
    4.4.1 Comparison of Momentum Strategies
    4.4.2 Comparison of PCA and Machine Learning Methods
    4.4.3 Overall Method Comparison
  4.5 Robustness Test
    4.5.1 Listed Stocks Only
    4.5.2 OTC Only
5 Conclusions
A Factor List and Performance Tables
  A.1 Factor List
    A.1.1 Single-Factor Performance Metrics
Reference
dc.format.extent 2343090 bytes
dc.format.mimetype application/pdf