Please use this identifier to cite or link to this item:

Title: 可加性迴歸模型的變數選擇
Variable selection for additive regression model
Authors: 謝嘉倫
Hsieh, Jia-Lun
Contributors: 黃子銘
Huang, Tzee-Ming
Hsieh, Jia-Lun
Keywords: B-spline
Variable selection
Function estimation
Nonparametric regression
Date: 2021
Issue Date: 2021-08-05 10:21:41 (UTC+8)
Abstract: The additive regression model is well suited to nonlinear data. To apply it in practice, the unknown functions in the model must be estimated; in this thesis they are estimated by spline smoothing with a B-spline basis. This estimation easily inflates the number of parameters in the model, increasing the computational burden. Moreover, applied research values not only goodness of fit but also the interpretability of the model, so appropriate variable selection is essential. This thesis discusses variable selection methods suited to the additive model, treating separately the case where the data contain only continuous variables and the case where they contain both continuous and categorical variables. Comparing two sets of simulation experiments leads to the following conclusion: the method for data with only continuous variables is fast and performs well, while the method for data with both types of variables is computationally heavier and slower; however, simplifying the latter so that it is used only to identify the relevant categorical variables may reduce the computational load while retaining good selection performance.
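The scheme described in the abstract (expand each additive component in a B-spline basis, then select whole coefficient blocks) can be illustrated with a short sketch. This is a minimal illustration, not the thesis's actual procedure: the group-lasso penalty, the proximal-gradient solver, and all names and tuning values (`bspline_basis`, `group_lasso_additive`, `n_basis`, `lam`) are assumptions made for the example.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, n_basis=8, degree=3):
    """Design matrix of a clamped cubic B-spline basis on [0, 1]."""
    knots = np.concatenate([np.zeros(degree),
                            np.linspace(0.0, 1.0, n_basis - degree + 1),
                            np.ones(degree)])
    # one column per basis function: a B-spline with a unit coefficient vector
    return np.column_stack(
        [BSpline(knots, np.eye(n_basis)[j], degree)(x) for j in range(n_basis)]
    )

def group_lasso_additive(X, y, n_basis=8, lam=5.0, n_iter=3000):
    """Fit sum_j f_j(x_j) with one B-spline block per predictor and a
    group-lasso penalty, by proximal gradient descent (ISTA)."""
    Z = np.hstack([bspline_basis(X[:, j], n_basis) for j in range(X.shape[1])])
    Z = Z - Z.mean(axis=0)                   # center columns; intercept absorbed
    yc = y - y.mean()
    beta = np.zeros(Z.shape[1])
    step = 1.0 / np.linalg.norm(Z, 2) ** 2   # 1/L for the squared-loss gradient
    for _ in range(n_iter):
        b = beta - step * Z.T @ (Z @ beta - yc)
        for j in range(X.shape[1]):          # group soft-thresholding per block
            s = slice(j * n_basis, (j + 1) * n_basis)
            norm = np.linalg.norm(b[s])
            b[s] *= max(0.0, 1.0 - step * lam / norm) if norm > 0 else 0.0
        beta = b
    return beta

# toy data: y depends on x1 only; x2 is an irrelevant continuous variable
rng = np.random.default_rng(0)
n = 200
X = rng.uniform(size=(n, 2))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(n)
beta = group_lasso_additive(X, y)
norms = [np.linalg.norm(beta[j * 8:(j + 1) * 8]) for j in range(2)]
```

The group norm of the relevant variable's coefficient block stays large while the irrelevant block is shrunk toward zero, which is the sense in which whole B-spline blocks, rather than single coefficients, are selected or discarded.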
Reference: Variyath, A. M. and Brobbey, A. Variable selection in multivariate multiple regression. PLOS ONE, 15(7), 2020.

K. Ulm and A. Hapfelmeier. A new variable selection approach using random forests. Computational Statistics and Data Analysis, 60:50-69, 2013.

Xuyuan Li, Holger R. Maier, and Aaron C. Zecchin. Improved PMI-based input variable selection approach for artificial neural network and other data-driven environmental and water resource models. Environmental Modelling and Software, 65:15-29, 2015.

Jerome H. Friedman and Werner Stuetzle. Projection pursuit regression. Journal of the American Statistical Association, 76, 1981.

Geoffrey S. Watson. Smooth regression analysis. Sankhyā: The Indian Journal of Statistics, Series A, 26(4):359-372, 1964.

Charles J. Stone. Additive regression and other nonparametric models. Annals of Statistics, 13(2), 1985.

I. J. Schoenberg. Contributions to the problem of approximation of equidistant data by analytic functions. Quarterly of Applied Mathematics, 4(1):45-99, 1946.

Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 58:267-288, 1996.

Ming Yuan and Yi Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B, 68:49-67, 2006.

Jian Huang, Joel L. Horowitz, and Fengrong Wei. Variable selection in nonparametric additive models. The Annals of Statistics, 38(4):2282-2313, 2010.

Xia Cui, Heng Peng, Songqiao Wen, and Lixing Zhu. Component selection in the additive regression model. Scandinavian Journal of Statistics, 40(3):491-510, 2013.

Miao Yang, Lan Xue, and Lijian Yang. Variable selection for additive model via cumulative ratios of empirical strengths total. Journal of Nonparametric Statistics, 28(3):595-616, 2016.

Shujie Ma and Jeffrey S. Racine. Additive regression splines with irrelevant categorical and continuous regressors. Statistica Sinica, 23:515-541, 2013.

Carl de Boor. On calculating with B-splines. Journal of Approximation Theory, 6(1):50-62, 1972.
Description: Master's thesis
Source URI:
Data Type: thesis
Appears in Collections: [Department of Statistics] Theses

Files in This Item:

File          Size    Format
402401.pdf    1130Kb  Adobe PDF

All items in 學術集成 are protected by copyright, with all rights reserved.
