Title A comparative study of two knot selection algorithms for B-Spline regression
(兩種基於B-Spline迴歸模型之節點選取演算法比較)
Author Wang, Zih-Yin (王姿尹)
Contributors 黃子銘 (advisor)
Wang, Zih-Yin (王姿尹)
Keywords B-Spline regression model
Knot selection
B-Spline tensor
Date 2019
Upload time 5-Feb-2020 17:07:25 (UTC+8)
Abstract This thesis addresses knot selection for the B-Spline regression model. Two algorithms, Algorithm 1 and Algorithm 2, are compared in terms of estimation accuracy, stability, and computational efficiency, measured respectively by the mean squared error (MSE), the variance of the MSE, and the time needed to estimate the knots. Algorithm 1 is the algorithm proposed in Huang (2019), which uses hypothesis testing to search for knots from the data. Algorithm 2 borrows the knot initialization idea of Zhou and Shen (2001) and then searches for the optimal knot positions. Both algorithms are also extended to the bivariate additive model. According to the simulation results, for univariate data with large errors, Algorithm 1 gives better estimation accuracy and high stability but is slightly slower; for univariate data with small errors, Algorithm 2 can give better estimation accuracy while being both highly stable and efficient. For bivariate data, Algorithm 1 estimates better when the regression function has many inflection points; otherwise, Algorithm 2 is the better choice. In terms of computational cost, however, both algorithms are time consuming on bivariate data.
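The three criteria named in the abstract (MSE, variance of the MSE, and running time) can be illustrated with a small simulation. The sketch below is not Algorithm 1 or Algorithm 2: it fits a cubic B-spline by least squares for a fixed, hand-picked set of interior knots and reports the three measurements over repeated noisy samples. The test function sin(2πx), the equally spaced knot sets, the sample size, and the grid-based definition of MSE are all assumptions made for illustration; only the error variances 0.1 and 0.01 are taken from the thesis outline.

```python
import math
import random
import statistics
import time


def bspline_basis(x, knots, degree):
    """Values of all B-spline basis functions at x (Cox-de Boor recursion)."""
    last = len(knots) - degree - 2  # last non-empty interval of a clamped knot vector
    b = [1.0 if (knots[i] <= x < knots[i + 1]) or (x == knots[-1] and i == last)
         else 0.0 for i in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        nxt = []
        for i in range(len(knots) - d - 1):
            dl = knots[i + d] - knots[i]
            dr = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / dl * b[i] if dl > 0 else 0.0
            right = (knots[i + d + 1] - x) / dr * b[i + 1] if dr > 0 else 0.0
            nxt.append(left + right)
        b = nxt
    return b


def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_spline(xs, ys, interior, degree=3, lo=0.0, hi=1.0):
    """Least-squares B-spline fit for a fixed set of interior knots."""
    knots = [lo] * (degree + 1) + sorted(interior) + [hi] * (degree + 1)
    rows = [bspline_basis(x, knots, degree) for x in xs]
    p = len(rows[0])
    # Normal equations (X'X) c = X'y; adequate for the small bases used here.
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * y for r, y in zip(rows, ys)) for j in range(p)]
    coef = solve(xtx, xty)
    return lambda x: sum(c * v for c, v in zip(coef, bspline_basis(x, knots, degree)))


def mse(fhat, f, grid):
    """Mean squared error of the fitted curve against the true function on a grid."""
    return sum((fhat(x) - f(x)) ** 2 for x in grid) / len(grid)


def simulate(f, interior, noise_var, n=100, reps=20, seed=0):
    """Return (mean MSE, variance of MSE, elapsed seconds) over repeated samples."""
    rng = random.Random(seed)
    xs = [(i + 0.5) / n for i in range(n)]
    grid = [i / 200 for i in range(201)]
    start = time.perf_counter()
    errors = []
    for _ in range(reps):
        ys = [f(x) + rng.gauss(0.0, math.sqrt(noise_var)) for x in xs]
        errors.append(mse(fit_spline(xs, ys, interior), f, grid))
    elapsed = time.perf_counter() - start
    return statistics.mean(errors), statistics.variance(errors), elapsed


if __name__ == "__main__":
    def truth(x):  # hypothetical regression function
        return math.sin(2 * math.pi * x)

    knot_sets = {"3 knots": [0.25, 0.5, 0.75], "9 knots": [i / 10 for i in range(1, 10)]}
    for var in (0.1, 0.01):
        for name, interior in knot_sets.items():
            m, v, t = simulate(truth, interior, var)
            print(f"var={var} {name}: mean MSE={m:.5f} var(MSE)={v:.2e} time={t:.2f}s")
```

Replacing the fixed knot sets with the output of a knot selection procedure would give the kind of accuracy, stability, and timing comparison the abstract describes.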
References Akaike, H. (1978). A New Look at the Bayes Procedure. Biometrika, 65(1), 53-59.

Beliakov, G. (2004). Cutting angle method - a tool for constrained global optimization. Optimization Methods and Software, 19(2), 137-151.

Biller, C. (2000). Adaptive Bayesian Regression Splines in Semiparametric Generalized Linear Models. Journal of Computational and Graphical Statistics, 9(1), 122-140.

Breiman, L. (1991). The Π Method for Estimating Multivariate Functions from Noisy Data. Technometrics, 33(2), 125-143.

Brenna, T. (2001). Knot Removal for Tensor Product Splines. Defense Technical Information Center Compilation Part Notice ADP 013743.

Burchard, H. G. (1974). Splines (with optimal knots) are better. Applicable Analysis, 3(4), 309-319.

Jupp, D. L. B. (1978). Approximation to Data by Splines with Free Knots. SIAM Journal on Numerical Analysis, 15(2), 328-343.

de Boor, C. (1978). A Practical Guide to Splines. New York: Springer.

Denison, D., Mallick, B., and Smith, A. (1998). Automatic Bayesian Curve Fitting. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 60(2), 333-350.

DiMatteo, I., Genovese, C., and Kass, R. (2001). Bayesian Curve-Fitting with Free-Knot Splines. Biometrika, 88(4), 1055-1071.

Friedman, J. (1991). Multivariate Adaptive Regression Splines. The Annals of Statistics, 19(1), 1-67.

Friedman, J., and Silverman, B. (1989). Flexible Parsimonious Smoothing and Additive Modeling. Technometrics, 31(1), 3-21.

Green, P. (1995). Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination. Biometrika, 82(4), 711-732.

Park, H. (2010). B-spline surface fitting based on adaptive knot placement using dominant columns. Computer-Aided Design, 43(3), 258-264.

Park, H., and Lee, J. (2007). B-spline curve fitting based on adaptive curve refinement using dominant points. Computer-Aided Design, 39(6), 439-451.

Halpern, E. (1973). Bayesian Spline Regression When the Number of Knots is Unknown. Journal of the Royal Statistical Society. Series B (Methodological), 35(2), 347-360.

He, X., and Shi, P. (1997). Bivariate Tensor-Product B-Splines in a Partly Linear Model. Journal of Multivariate Analysis, 58(2), 162-181.

Huang, T. M. (2014). An adaptive knot selection method for regression splines via penalized minimum contrast estimation. The 3rd Institute of Mathematical Statistics Asia Pacific Rim Meeting (2014 IMS APRM).

Huang, T. M. (2019, August). A knot selection algorithm for regression splines. 62nd ISI World Statistics Congress, Kuala Lumpur.

Huang, T. M. (in press). An estimation procedure with knot selection for multivariate regression splines. International Conference on Frontiers of Artificial Intelligence and Statistics (CFAIS 2020).

Jones, M. (1989). Journal of the Royal Statistical Society. Series A (Statistics in Society), 152(1), 119-120.

Kaishev, V. K., Dimitrova, D. S., Haberman, S., and Verrall, R. J. (2016). Geometrically designed, variable knot regression splines. Computational Statistics, 31(3), 1079-1105.

Kartal-Koç, E., and Iyigun, C. (2014). Restructuring forward step of MARS algorithm using a new knot selection procedure based on a mapping approach. Journal of Global Optimization, 60, 79-102.

Keele, L. (2008). Semiparametric Regression for the Social Sciences. Chichester, UK: Wiley. ISBN 978-0470319918.

Kooperberg, C., Stone, C., and Truong, Y. (1995). Hazard Regression. Journal of the American Statistical Association, 90(429), 78-94.

Lindstrom, M. (1999). Penalized Estimation of Free-Knot Splines. Journal of Computational and Graphical Statistics, 8(2), 333-352.

Lyche, T., and Mørken, K. (1987). Knot removal for parametric B-spline curves and surfaces. Computer Aided Geometric Design, 4, 217-230.

Meyer, M. (2008). Inference Using Shape-Restricted Regression Splines. The Annals of Applied Statistics, 2(3), 1013-1033.

Miyata, S., and Shen, X. (2003). Adaptive Free-Knot Splines. Journal of Computational and Graphical Statistics, 12(1), 197-213.

Molinari, N., Durand, J., and Sabatier, R. (2004). Bounded optimal knots for regression splines. Computational Statistics & Data Analysis, 45(2), 159-178.

O'Sullivan, F., Yandell, B., and Raynor, W. (1986). Automatic Smoothing of Regression Functions in Generalized Linear Models. Journal of the American Statistical Association, 81(393), 96-103.

Osborne, M., Presnell, B., and Turlach, B. (2000). On the LASSO and Its Dual. Journal of Computational and Graphical Statistics, 9(2), 319-337.

Eilers, P. H. C., and Marx, B. (1996). Flexible Smoothing with B-splines and Penalties. Statistical Science, 11(2), 89-102.

Ruppert, D. (2002). Selecting the Number of Knots for Penalized Splines. Journal of Computational and Graphical Statistics, 11(4), 735-757.

Schütze, T., and Schwetlick, H. (2003). Bivariate Free Knot Splines. BIT Numerical Mathematics, 43(1), 153-178.

Schoenberg, I. (1946). Contributions to the Problem of Approximation of Equidistant Data by Analytic Functions: Part A. On the Problem of Smoothing or Graduation. A First Class of Analytic Approximation Formulae. Quarterly of Applied Mathematics, 4(1), 45-99.

Schumaker, L. (1991). SIAM Review. 33(3), 502-502.

Schwarz, G. (1978). Estimating the Dimension of a Model. The Annals of Statistics, 6(2), 461-464.

Smith, M., and Kohn, R. (1996). Nonparametric Regression using Bayesian Variable Selection. Journal of Econometrics, 75(2), 317-343.

Stein, C. (1981). Estimation of the Mean of a Multivariate Normal Distribution. The Annals of Statistics, 9(6), 1135-1151.

Stone, C., Hansen, M., Kooperberg, C., and Truong, Y. (1997). Polynomial Splines and their Tensor Products in Extended Linear Modeling. The Annals of Statistics, 25(4), 1371-1425.

Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267-288.

Van Loock, W., Pipeleers, G., Schutter, J., and Swevers, J. (2011). A convex optimization approach to curve fitting with B-splines. IFAC Proceedings Volumes (IFAC-PapersOnline), 18, 2290-2295.

Wahba, G. (1977). A survey of some smoothing problems and the method of generalized cross-validation for solving them. University of Wisconsin, Department of Statistics.

Weller, F. (1994). B-Spline Surfaces with Knot Segments. Interner Bericht des Fachbereich Informatik, (249), 32.

Zhang, Y., Cao, J., Chen, Z., Li, X., and Zeng, X. M. (2016). B-spline Surface Fitting with Knot Position Optimization. Computers & Graphics, 58, 73-83.

Zhou, S., and Shen, X. (2001). Spatially Adaptive Regression Splines and Accurate Knot Selection Schemes. Journal of the American Statistical Association, 96(453), 247-259.
Description Master's degree
National Chengchi University
Department of Statistics
107354007
Source http://thesis.lib.nccu.edu.tw/record/#G0107354007
Type thesis
dc.contributor.advisor 黃子銘
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/128560
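dc.description.tableofcontents 1 Introduction
1.1 Research background
1.2 Motivation and objectives
1.3 Approximating univariate functions
1.4 Approximating bivariate functions

2 Literature review

3 Methods
3.1 Algorithm 1
3.1.1 Knot selection criterion
3.1.2 Extension to the bivariate additive model
3.2 Algorithm 2
3.2.1 Knot selection criterion
3.2.2 Extension to the bivariate additive model

4 Simulation and comparison
4.1 Univariate data
4.1.1 When the variance of the error ϵ_i is 0.1
4.1.2 When the variance of the error ϵ_i is 0.01
4.2 Bivariate data

5 Conclusion

References

Appendix 1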
dc.format.extent 5223948 bytes
dc.format.mimetype application/pdf
dc.identifier.doi (DOI) 10.6814/NCCU202000007