Title: 無母數主成份迴歸與向前迴歸的比較 (Nonparametric Principal Components Regression Compared with Forward Regression)
Author: 陳弘叡 (Chen, Hong Rui)
Advisor: 黃子銘 (Huang, Tzee Ming)
Keywords: Variable Selection; Principal Component Analysis; Forward Selection; Nonparametric Additive Model; Backfitting Algorithm; B-Spline; Integrated Absolute Error
Date: 2016
Uploaded: 22-Aug-2016 10:42:31 (UTC+8)

Abstract (Chinese, translated): In a general linear regression model, when the sample size exceeds the number of variables, the regression parameters are estimated by least squares. When the number of variables exceeds the sample size, however, collinearity makes the parameters inestimable, and the contribution of each explanatory variable to the response cannot be determined. To deal with this, we select the important variables using two methods: principal component analysis (PCA) and forward selection (FS). The model we work with is the nonparametric additive regression model: the full regression function is estimated with the backfitting algorithm, and each component function is estimated nonparametrically with B-splines. We apply both selection methods within the nonparametric additive model, use the integrated absolute error as the evaluation criterion, and simulate data with different numbers of variables and different generating models to judge which selection method is more suitable. In most cases the simulations show that FS performs better than PCA.
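The abstracts describe the fitting procedure only at a high level. The following is a minimal sketch of backfitting with least-squares B-spline component fits, assuming NumPy and SciPy are available; the helper names (bspline_basis, backfit_additive), the equally spaced knots, and the fixed number of iterations are illustrative assumptions, not the thesis's actual implementation, which would normally also monitor convergence of the component functions.

import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, n_interior_knots=5, degree=3):
    # Cubic B-spline basis with equally spaced interior knots and the
    # boundary knots repeated degree + 1 times (an open knot vector).
    lo, hi = x.min() - 1e-9, x.max() + 1e-9
    interior = np.linspace(lo, hi, n_interior_knots + 2)[1:-1]
    knots = np.r_[[lo] * (degree + 1), interior, [hi] * (degree + 1)]
    n_basis = len(knots) - degree - 1
    basis = np.empty((len(x), n_basis))
    for i in range(n_basis):
        coef = np.zeros(n_basis)
        coef[i] = 1.0
        basis[:, i] = BSpline(knots, coef, degree)(x)  # i-th basis function at x
    return basis

def backfit_additive(X, y, n_iter=20):
    # Backfitting for y = alpha + sum_j f_j(x_j) + error, with each f_j
    # estimated by a least-squares fit on its B-spline basis and centered
    # to mean zero for identifiability.
    n, p = X.shape
    bases = [bspline_basis(X[:, j]) for j in range(p)]
    alpha = y.mean()
    f = np.zeros((n, p))  # f[:, j] = current estimate of f_j at the data points
    for _ in range(n_iter):
        for j in range(p):
            partial = y - alpha - f.sum(axis=1) + f[:, j]  # partial residuals for x_j
            coef, *_ = np.linalg.lstsq(bases[j], partial, rcond=None)
            fit_j = bases[j] @ coef
            f[:, j] = fit_j - fit_j.mean()
    return alpha, f

# Usage on simulated data: two informative variables, one noise variable.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = np.sin(2 * np.pi * X[:, 0]) + (X[:, 1] - 0.5) ** 2 + rng.normal(scale=0.1, size=200)
alpha_hat, f_hat = backfit_additive(X, y)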
Abstract (English): In a general linear regression model, when the sample size $n$ is greater than the number of variables $p$, it is common to use the least squares method to estimate the parameters of the regression model. When $n < p$, the parameters cannot be estimated due to collinearity, so it is necessary to perform variable selection before estimating them. In this thesis, I compare two variable selection methods in nonparametric additive regression: the first is based on principal component analysis (PCA), and the second on forward selection (FS). The integrated absolute error is used to evaluate the performance of the two methods in simulation studies. The results show that FS performs better than PCA in most cases.
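Neither abstract writes out the model or the error criterion explicitly. As a hedged sketch under standard definitions (the thesis's exact formulation, e.g. the integration region or how the error is averaged over simulation replications, may differ), the additive regression model is
$$ y_i = \alpha + \sum_{j=1}^{p} f_j(x_{ij}) + \varepsilon_i, \qquad i = 1, \dots, n, $$
with each component $f_j$ estimated by a B-spline fit inside the backfitting algorithm, and the integrated absolute error of the fitted regression function $\hat m(x) = \hat\alpha + \sum_j \hat f_j(x_j)$ relative to the true function $m$ is
$$ \mathrm{IAE} = \int \bigl| \hat m(x) - m(x) \bigr| \, dx, $$
approximated in practice by averaging $|\hat m - m|$ over a grid or over the simulated design points.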
Description: Master's thesis, National Chengchi University (國立政治大學), Department of Statistics, student ID 103354028
Source: http://thesis.lib.nccu.edu.tw/record/#G0103354028
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/100453
Type: thesis

Table of Contents
1. Introduction
2. Literature Review
  2.1 The Additive Model and the Backfitting Algorithm
  2.2 The B-Spline Regression Model
3. Methodology
  3.1 Principal Component Analysis (PCA)
    3.1.1 Definition
    3.1.2 Computing the Principal Components
    3.1.3 Selection Criterion
  3.2 Forward Selection (FS)
    3.2.1 Definition
    3.2.2 Testing Whether a Variable in the Additive Model Is Significant
    3.2.3 Selection Criterion
4. Simulated Data and Comparison Results
  4.1 Simulated Data
    4.1.1 Experiment 1: p = 20 variables, n > p, 10 replications
    4.1.2 Experiment 2: p = 200 variables, n = p, 10 replications
    4.1.3 Experiment 3: p = 250 variables, n < p
  4.2 Comparison Results
5. Conclusions and Suggestions
  5.1 Conclusions
  5.2 Suggestions
Appendix