Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/108112
Title: Application of LASSO regression in estimating B-Spline-Based hazard functions
Author: Lin, Zi-Yuan (林子元)
Contributors: Huang, Tzee-Ming (黃子銘)
Lin, Zi-Yuan (林子元)
Keywords: Proportional hazards model
B-splines
Group lasso
Bootstrap
Date: 2017
Uploaded: 5-Apr-2017
Abstract: A key assumption of the Cox proportional hazards model is that the log hazard function is linear in the covariates. This assumption may be violated in practice. This thesis examines the feasibility of approximating the nonlinear effect of each covariate by a linear combination of B-spline basis functions when the assumption fails. The basis coefficients are estimated with the group lasso. With a suitable penalty coefficient, the whole group of basis coefficients belonging to a covariate with no explanatory power is estimated as zero simultaneously, which keeps the model interpretable. Finally, I develop hypothesis tests for the proposed model. In addition to the ordinary Wald, likelihood ratio, and score statistics, two other types of test statistic are considered: one adjusted for the penalty term and one based on bootstrap samples. Simulation studies are carried out to compare the performance of these statistics.
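The group-wise zeroing described in the abstract can be illustrated with a small sketch. This record does not give the thesis's estimation algorithm, so the code below is an assumption, not the author's method: it uses the standard block soft-thresholding operator (the proximal operator of a Euclidean-norm penalty), which is the usual building block behind group lasso solvers. The function name `group_soft_threshold`, the threshold, and the coefficient values are all hypothetical.

```python
import math


def group_soft_threshold(coefs, threshold):
    """Block soft-thresholding: proximal operator of threshold * ||coefs||_2.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    norm = math.sqrt(sum(c * c for c in coefs))
    if norm <= threshold:
        # The entire group is set to exactly zero at once.
        return [0.0] * len(coefs)
    # Otherwise every coefficient in the group is shrunk by the same factor.
    scale = 1.0 - threshold / norm
    return [scale * c for c in coefs]


# Made-up B-spline basis coefficients for two covariates (illustrative only).
weak_group = [0.03, -0.04, 0.0]   # small Euclidean norm: non-predictive covariate
strong_group = [3.0, 4.0, 0.0]    # Euclidean norm 5: predictive covariate

weak_shrunk = group_soft_threshold(weak_group, threshold=1.0)
strong_shrunk = group_soft_threshold(strong_group, threshold=1.0)

print(weak_shrunk)    # all coefficients of the weak group become zero
print(strong_shrunk)  # the strong group is shrunk towards zero but kept
```

Because the penalty acts on the norm of each group rather than on single coefficients, a covariate is either dropped entirely or kept with all of its basis coefficients, which is the interpretability property the abstract refers to.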
Description: Master's thesis
National Chengchi University
Department of Statistics
103354014
Source: http://thesis.lib.nccu.edu.tw/record/#G1033540143
Type: thesis
Appears in Collections: Theses and Dissertations

Files in This Item:
File: 014301.pdf
Size: 1.65 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.