Publications-Theses
Title: Lasso Regression for Interpretable Predictive Analytics: Strong Hierarchy and Tree Structure (Lasso迴歸於可詮釋預測分析:強階層與樹狀結構)
Author: Chen, Ting-Wen (陳婷文)
Advisors: Chuang, Hao-Chun; Chou, Yen-Chun
Keywords:
Interpretability
Lasso regression
Machine learning
Tree hierarchy
Strong hierarchy
Date: 2020
Uploaded: 3-Aug-2020 17:35:24 (UTC+8)
Abstract:
With the growing availability of observational data and variables, machine-learning-based predictive models have been widely applied across fields. Random Forests and Gradient Boosting Machines are two popular models that use tree ensembles to capture nonlinear relationships between independent and dependent variables and to handle high-dimensional data, improving prediction accuracy. However, these models lack interpretability and are therefore difficult to apply in business settings such as credit risk assessment. As a result, practitioners still rely on regression models for their interpretability. Lasso regression is a key technique for including high-dimensional data while avoiding overfitting. In this study, we discuss two Lasso-based models that can greatly improve prediction accuracy while retaining interpretability. One is hierarchical group-lasso regularization, proposed by Lim and Hastie (2015), which expands the feature space with interaction terms while enforcing strong hierarchy to keep the model easy to interpret. The other is cluster-while-regression with tree hierarchy, proposed in this study, which captures nonlinearity by clustering: it simultaneously assigns samples to clusters through a tree structure and fits a Lasso regression in each leaf, using mixed-integer programming to train the model to global optimality. Across several datasets, both enhanced models significantly outperform traditional Lasso regression. Cluster-while-regression with tree hierarchy even performs no worse than Random Forests and Gradient Boosting Machines while retaining high interpretability. Our study thus contributes to interpretable artificial intelligence.

References:
Alpaydin, E. (2020). Introduction to Machine Learning (4th ed.). MIT Press.
Baardman, L., Levin, I., Perakis, G., & Singhvi, D. (2018). Leveraging comparables for new product sales forecasting. Production and Operations Management, 27(12), 2339-2449.
Bien, J., Taylor, J., & Tibshirani, R. (2013). A lasso for hierarchical interactions. Annals of Statistics, 41(3), 1111.
Cox, D. R. (1984). Interaction. International Statistical Review/Revue Internationale de Statistique, 52(1), 1-24.
DeSarbo, W. S., Oliver, R. L., & Rangaswamy, A. (1989). A simulated annealing methodology for clusterwise linear regression. Psychometrika, 54(4), 707-736.
Dunn, J. W. (2018). Optimal Trees for Prediction and Prescription (Doctoral dissertation, Massachusetts Institute of Technology). Retrieved from http://dspace.mit.edu/handle/1721.1/7582
Farrar, D. E., & Glauber, R. R. (1967). Multicollinearity in regression analysis: The problem revisited. The Review of Economics and Statistics, 49(1), 92-107.
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1.
Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67.
Hu, K., Acimovic, J., Erize, F., Thomas, D. J., & Van Mieghem, J. (2019). Forecasting new product life cycle curves: Practical approach and empirical analysis. Manufacturing and Service Operations Management, 21(1), 66-85.
Lim, M., & Hastie, T. (2015). Learning interactions via hierarchical group-lasso regularization. Journal of Computational and Graphical Statistics, 24(3), 627-654.
McCullagh, P., & Nelder, J. A. (1989). Generalized Linear Models (2nd ed.). CRC Press.
Ogutu, J. O., Schulz-Streeck, T., & Piepho, H. P. (2012). Genomic selection using regularized linear regression models: Ridge regression, Lasso, elastic net and their extensions. BMC Proceedings, 6, S10.
Park, Y. W., Jiang, Y., Klabjan, D., & Williams, L. (2017). Algorithms for generalized clusterwise linear regression. INFORMS Journal on Computing, 29(2), 301-317.
Russell, S., & Norvig, P. (2013). Artificial Intelligence: A Modern Approach. Pearson Education Limited.
Späth, H. (1979). Algorithm 39 clusterwise linear regression. Computing, 22(4), 367-373.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288.
Yang, L., Liu, S., Tsoka, S., & Papageorgiou, L. G. (2017). A regression tree approach using mathematical programming. Expert Systems with Applications, 78, 347-357.
Yuan, M., & Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1), 49-67.
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301-320.

Description: Master's thesis
National Chengchi University
Department of Management Information Systems
Student ID: 107356008
Source: http://thesis.lib.nccu.edu.tw/record/#G0107356008
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/130976
Type: thesis
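To make the abstract's baseline concrete, the following is a minimal sketch of Lasso regression fitted by cyclic coordinate descent, the shrinkage-and-selection technique the thesis builds on. This is an illustrative toy implementation with invented data, not code from the thesis, and it omits standardization and convergence checks for brevity.

```python
import numpy as np

def soft_threshold(rho, lam):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent:
    minimize (1/2n) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's own contribution added back
            r_j = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r_j / n
            z = X[:, j] @ X[:, j] / n
            b[j] = soft_threshold(rho, lam) / z
    return b

# Toy data: only the first two features matter, so the L1 penalty
# should drive the third coefficient to (near) zero.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
y = 3 * X[:, 0] - 2 * X[:, 1]
coef = lasso_cd(X, y, lam=0.1)
```

The zeroed third coefficient is exactly the property that keeps Lasso models interpretable in high dimensions: irrelevant variables drop out of the fitted equation rather than receiving small noisy weights.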
Table of Contents:
Chapter 1 Introduction 1
  1.1 Research Background and Motivation 1
  1.2 Research Objectives and Contributions 2
Chapter 2 Literature Review 4
  2.1 Regularization for Regression Models 4
    Linear regression 4
    Ridge regression 5
    Lasso regression (least absolute shrinkage and selection operator) 6
    Elastic net 7
  2.2 Cluster-while-estimate Algorithms 7
Chapter 3 Methodology 10
  3.1 The Hierarchical Group-lasso Regularization Algorithm 10
  3.2 The Cluster-while-regression with Tree Hierarchy Algorithm 14
Chapter 4 Results 18
  4.1 Datasets with a Continuous Dependent Variable 18
  4.2 Datasets with a Binary Dependent Variable 31
Chapter 5 Conclusion 41
Chapter 6 References 42

Format: application/pdf (2183082 bytes)
DOI: 10.6814/NCCU202001103
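The cluster-while-regression idea described in the abstract can be conveyed with a simple alternating heuristic: fit one regression per cluster, reassign each sample to the cluster whose model predicts it best, and repeat. The thesis instead trains the tree-structured version with per-leaf Lasso to global optimality via mixed-integer programming; the sketch below is only a local-search illustration on invented two-regime data, and it uses ordinary least squares in each cluster for brevity.

```python
import numpy as np

def cluster_while_regress(X, y, k=2, n_iter=20, seed=0):
    """Alternate between (1) fitting a least-squares model per cluster and
    (2) reassigning each sample to the cluster whose model fits it best."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xb = np.column_stack([np.ones(n), X])   # prepend an intercept column
    assign = rng.integers(0, k, size=n)     # random initial clustering
    betas = np.zeros((k, Xb.shape[1]))
    for _ in range(n_iter):
        for c in range(k):
            mask = assign == c
            if mask.sum() > Xb.shape[1]:    # enough points to fit this cluster
                betas[c], *_ = np.linalg.lstsq(Xb[mask], y[mask], rcond=None)
        # Squared error of every sample under every cluster's model
        errs = (Xb @ betas.T - y[:, None]) ** 2
        assign = errs.argmin(axis=1)
    return betas, assign

# Toy data with two latent regimes: slope 2 in group 0, slope -3 in group 1.
# A single linear model cannot fit this; two clusterwise models can.
rng = np.random.default_rng(1)
x = rng.standard_normal(300)
group = rng.integers(0, 2, size=300)
y = np.where(group == 0, 2.0 * x, -3.0 * x)
betas, assign = cluster_while_regress(x[:, None], y, k=2)
```

On this data the heuristic recovers the two regimes, yielding one model with slope near 2 and one near -3; each fitted model remains a plain regression equation, which is the interpretability argument the abstract makes for the clusterwise approach.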