Title An Algorithm for Knot Selection in Spline Regression Based on Principal Component Analysis
基於主成分分析之Spline迴歸節點選取演算法
Author Tsai, Yao-De (蔡耀德)
Contributors Huang, Tzee-Ming (黃子銘)
Tsai, Yao-De (蔡耀德)
Keywords Function approximation
Nonparametric regression
Spline function
Principal component analysis
Knot selection
Free knots
Date 2023
Upload time 1-Sep-2023 14:56:48 (UTC+8)
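Abstract In function approximation, nonparametric regression can be used, and spline functions are commonly taken as the basis functions for estimating the function. In constructing a spline, knot selection is crucial: the positions and the number of knots directly affect the quality of the estimate. This thesis therefore uses the power basis and the truncated power basis as the spline basis and proposes a knot selection method based on principal component analysis (PCA). During knot selection, the test knots are divided into important knots and minor knots according to a specified spacing δ. PCA is then applied to the effects of the minor knots, the influence of the minor-knot principal components on the response variable is removed, and important knots are selected stepwise according to the Bayesian information criterion (BIC). Finally, different values of δ are tried to find the list of important knots that minimizes the model BIC. In the experiments, the proposed method is compared with forward selection and the method of Huang (2019) on B-spline functions and on four signal-processing functions. The Wilcoxon matched-pairs rank test shows that the proposed method outperforms the other two methods on B-spline functions when the sample size is small, and on the four signal functions regardless of sample size. However, it requires noticeably more time when the sample size is large.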
In nonparametric regression, it is common to first approximate the regression function using functions known up to a finite dimensional parameter and then estimate the unknown parameter to obtain the regression function estimator. One commonly used class of functions for function approximation is the class of splines. When using a spline for function approximation, the position and number of knots of the spline crucially affect the approximation performance. Therefore, in this thesis, a knot selection method based on principal component analysis (PCA) is proposed.
The proposed knot selection method tests whether knots need to be placed at specific locations. During the knot selection process, splines are expressed as linear combinations of power basis functions and truncated power basis functions based on the test knots, and the test knots are divided into important and minor knots according to a specified spacing parameter δ.
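As an illustration of the basis described above, the following minimal sketch builds a design matrix from power basis and truncated power basis functions. The function name `truncated_power_design`, the cubic degree, and the example knots are assumptions made for illustration and are not taken from the thesis.

```python
import numpy as np

def truncated_power_design(x, knots, degree=3):
    """Columns: 1, x, ..., x^degree, then (x - k)_+^degree for each knot k.

    Hypothetical helper for illustration; assumes degree >= 1.
    """
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    # Power basis part: x^0, x^1, ..., x^degree.
    power = np.vander(x, degree + 1, increasing=True)
    # Truncated power part: one column per knot, zero to the left of the knot.
    truncated = np.maximum(x[:, None] - knots[None, :], 0.0) ** degree
    return np.hstack([power, truncated])

x = np.linspace(0.0, 1.0, 5)
X = truncated_power_design(x, knots=[0.25, 0.5], degree=3)
print(X.shape)  # 4 power columns + 2 knot columns
```

Each test knot contributes exactly one column, so deciding whether a knot is needed amounts to deciding whether its column should enter the regression.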
Subsequently, PCA is applied to the effects of minor knots, and the significance of a test knot is tested after the influence of the principal components of minor knots on the dependent variable is eliminated. Important knots are then selected stepwise according to the Bayesian Information Criterion (BIC).
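The idea of summarizing minor-knot effects by principal components and judging a candidate knot by BIC can be sketched roughly as follows. This is not the thesis's algorithm: the simulated data, the knot locations, the choice of two components, and the Gaussian BIC form n·log(RSS/n) + p·log(n) are all assumptions made for illustration, and the thesis's significance test and stepwise procedure are not reproduced.

```python
import numpy as np

def bic(y, X):
    """Gaussian BIC of a least-squares fit: n*log(RSS/n) + p*log(n)."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + p * np.log(n)

def leading_pcs(M, n_components):
    """Scores of the leading principal components of the columns of M."""
    Mc = M - M.mean(axis=0)
    U, s, _ = np.linalg.svd(Mc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def tp(x, knot, degree=3):
    """Truncated power function (x - knot)_+^degree."""
    return np.maximum(x - knot, 0.0) ** degree

# Hypothetical simulated data with a single true knot at 0.5.
rng = np.random.default_rng(0)
n = 200
x = np.sort(rng.uniform(0.0, 1.0, n))
y = x + 40.0 * tp(x, 0.5) + rng.normal(0.0, 0.05, n)

base = np.vander(x, 4, increasing=True)            # power basis 1, x, x^2, x^3
candidate = tp(x, 0.5)[:, None]                    # knot under consideration
minor_knots = (0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9)  # assumed minor knots
minor = np.column_stack([tp(x, k) for k in minor_knots])
pcs = leading_pcs(minor, n_components=2)           # assumed number of components

# Compare models with and without the candidate knot, with the
# minor-knot principal components included in both.
nuisance = np.hstack([base, pcs])
bic_without = bic(y, nuisance)
bic_with = bic(y, np.hstack([nuisance, candidate]))
print(bic_without, bic_with)
```

Under this sketch, a candidate knot would be kept when including its column lowers the BIC.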
Finally, the set of important knots that minimizes the BIC of the model is found by considering a set of values of δ.
Simulation experiments have been carried out to compare the proposed method with forward selection and Huang's method (2019), where the regression function is either a simulated spline function or one of four specific functions used in signal processing.
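For orientation, two test functions of this kind, HeaviSine and Doppler, are sketched below using the definitions commonly attributed to Donoho and Johnstone (1994). Whether the thesis uses exactly these parameterizations, scalings, and noise levels is an assumption. The helper `simulate` and its arguments are hypothetical.

```python
import numpy as np

def heavisine(t):
    """HeaviSine: 4 sin(4*pi*t) - sgn(t - 0.3) - sgn(0.72 - t)."""
    t = np.asarray(t, dtype=float)
    return 4.0 * np.sin(4.0 * np.pi * t) - np.sign(t - 0.3) - np.sign(0.72 - t)

def doppler(t, eps=0.05):
    """Doppler: sqrt(t(1 - t)) * sin(2*pi*(1 + eps) / (t + eps))."""
    t = np.asarray(t, dtype=float)
    return np.sqrt(t * (1.0 - t)) * np.sin(2.0 * np.pi * (1.0 + eps) / (t + eps))

def simulate(f, n, sigma, rng):
    """Hypothetical helper: draw x uniformly on [0, 1] and add Gaussian noise to f(x)."""
    x = np.sort(rng.uniform(0.0, 1.0, n))
    return x, f(x) + rng.normal(0.0, sigma, n)

rng = np.random.default_rng(1)
x_sim, y_sim = simulate(doppler, n=100, sigma=0.1, rng=rng)
print(x_sim.shape, y_sim.shape)
```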
The experimental results indicate that the proposed method outperforms the other two methods when the sample size is small and the regression function is a spline, or when the regression function is one of the four given functions, regardless of the sample size. The comparison is based on Wilcoxon's matched-pairs rank test.
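A paired comparison of this kind can be sketched as below. The code computes the Wilcoxon signed-rank statistic with a normal approximation and no tie correction, which is a simplification. The error values are randomly generated placeholders, not results from the thesis, and the function name `signed_rank` is hypothetical. In practice a library routine such as `scipy.stats.wilcoxon` would normally be used instead.

```python
import numpy as np

def signed_rank(a, b):
    """Wilcoxon signed-rank statistic W+ and its normal-approximation z-score.

    Simplified sketch: zero differences are dropped and ties among the
    absolute differences are not given average ranks.
    """
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    d = d[d != 0.0]
    m = len(d)
    # Rank the absolute differences from 1 (smallest) to m (largest).
    ranks = np.empty(m)
    ranks[np.argsort(np.abs(d))] = np.arange(1, m + 1)
    w_plus = float(ranks[d > 0].sum())
    mean = m * (m + 1) / 4.0
    var = m * (m + 1) * (2 * m + 1) / 24.0
    return w_plus, (w_plus - mean) / np.sqrt(var)

# Placeholder per-replication errors for two methods (hypothetical values).
rng = np.random.default_rng(2)
err_a = rng.gamma(2.0, 0.010, size=50)
err_b = rng.gamma(2.0, 0.012, size=50)
w_plus, z = signed_rank(err_a, err_b)
print(w_plus, z)
```

A strongly negative z would suggest that the first method tends to have smaller errors than the second across the paired replications.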
However, the proposed method is time-consuming when the sample size is large.
References Bartels, R. H., Beatty, J. C., and Barsky, B. A. (1995). An introduction to splines for use in computer graphics and geometric modeling. Morgan Kaufmann.
Dani, A., Ratnasari, V., and Budiantara, I. (2021). Optimal knots point and bandwidth selection in modeling mixed estimator nonparametric regression. IOP Conference Series: Materials Science and Engineering, 1115(1):012020.
Donoho, D. L. and Johnstone, I. M. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81(3):425–455.
Eubank, R. L. (1999). Nonparametric regression and spline smoothing. CRC Press.
Fan, J., Lou, Z., and Yu, M. (2023). Are latent factor regression and sparse regression adequate? Journal of the American Statistical Association, pages 1–13.
Friedman, J. H. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19(1):1–67.
George, E. I. and McCulloch, R. E. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association, 88(423):881–889.
Huang, T. M. (2019). A knot selection algorithm for regression splines. Proceedings of the 62nd ISI World Statistics Congress, Contributed Paper Session, 2:372–377.
Ratnasari, V., Budiantara, I., Ratna, M., and Zain, I. (2016). Estimation of nonparametric regression curve using mixed estimator of multivariable truncated spline and multivariable kernel. Global Journal of Pure and Applied Mathematics, 12(6):5047–5057.
Wang, R.-H. (2013). Multivariate spline functions and their applications, volume 529. Springer Science & Business Media.
Zhou, S. and Shen, X. (2001). Spatially adaptive regression splines and accurate knot selection schemes. Journal of the American Statistical Association, 96(453):247–259.
Description Master's thesis
National Chengchi University
Department of Statistics
110354016
Source http://thesis.lib.nccu.edu.tw/record/#G0110354016
Type thesis
dc.identifier (Other Identifiers) G0110354016
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/146902
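dc.description.tableofcontents Abstract (Chinese)
Abstract (English)
Contents
List of Figures
List of Tables

Chapter 1 Introduction
Section 1 Introduction to spline functions

Chapter 2 Literature review

Chapter 3 Methodology
Section 1 Model assumptions
Section 2 Introduction to the FARM model
Section 3 Knot selection method based on principal component analysis

Chapter 4 Simulated data
Section 1 Data generated from B-spline functions
4.1.1 Case with few knots
4.1.2 Case with many knots
Section 2 Data generated from four signal functions
4.2.1 Blocks function
4.2.2 Bumps function
4.2.3 HeaviSine function
4.2.4 Doppler function
Section 3 Analysis of estimation performance
4.3.1 Selection of the spacing δ in the proposed method
4.3.2 Analysis of estimation error

Chapter 5 Conclusions and suggestions

Appendix
References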
dc.format.extent 3614098 bytes
dc.format.mimetype application/pdf