Title Nonparametric Principal Components Regression Compared with Forward Regression (無母數主成份迴歸與向前迴歸的比較)
Author Chen, Hong Rui (陳弘叡)
Advisor Huang, Tzee Ming (黃子銘)
Keywords Variable Selection
Principal Component Analysis
Forward Selection
Nonparametric Additive Model
Backfitting Algorithm
B-Spline
Integrated Absolute Error
Date 2016
Uploaded 22-August-2016 10:42:31 (UTC+8)
Abstract In a general linear regression model, when the sample size $n$ is greater than the number of variables $p$, it is common to use the least squares method to estimate the parameters in the regression model. When $n < p$, the parameters cannot be estimated due to collinearity, and the effect of each individual predictor on the response cannot be determined, so it is necessary to perform variable selection before estimating the parameters. In this thesis, I compare two variable selection methods in nonparametric additive regression: the first is based on principal component analysis (PCA) and the second on forward selection (FS).

The model used is the nonparametric additive regression model. The overall regression function is estimated with the backfitting algorithm, and each component function is estimated nonparametrically with B-splines. Both selection methods are applied to the nonparametric additive model, and the integrated absolute error is used to evaluate their performance in simulations with different numbers of variables and different data-generating models. The simulation results show that FS performs better than PCA in most cases.
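As a reading aid, the display below is a minimal sketch of the setup described in the abstract; it is not taken from the thesis, and the symbols (the component functions $f_j$, the B-spline basis functions $B_{jk}$, and the regression function $m$) are illustrative choices rather than the thesis's own notation.

\[
Y_i = \alpha + \sum_{j=1}^{p} f_j(X_{ij}) + \varepsilon_i, \qquad i = 1, \dots, n, \qquad f_j(x) \approx \sum_{k=1}^{K_j} \beta_{jk} B_{jk}(x),
\]
where the backfitting algorithm cycles over $j = 1, \dots, p$ and refits $f_j$ to the partial residuals $Y_i - \alpha - \sum_{l \neq j} \hat{f}_l(X_{il})$ until the estimates stabilize. For a fitted regression function $\hat{m}$ and the true function $m$, an integrated absolute error criterion of the kind described above takes the form
\[
\mathrm{IAE}(\hat{m}) = \int \bigl| \hat{m}(x) - m(x) \bigr| \, dx.
\]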
References [1] 陳順宇. 多變量分析 (Multivariate Analysis), 4th edition. 華泰書局, 2005.
[2] Zarah Katharina Saxil Andersen. Boligpriseffekten af den københavnske metro. 2009.
[3] Erniel Barrios and Jennifer Umali. Nonparametric principal components regression. In Proceedings of the 58th World Congress of the International Statistical Institute.
[4] Andreas Buja, Trevor Hastie, and Robert Tibshirani. Linear smoothers and additive models. The Annals of Statistics, pages 453–510, 1989.
[5] Norman Cliff. The eigenvalues-greater-than-one rule and the reliability of components. Psychological Bulletin, 103(2):276, 1988.
[6] Carl de Boor. On calculating with B-splines. Journal of Approximation Theory, 6(1):50–62, 1972.
[7] M. A. Efroymson. Multiple regression analysis. Mathematical Methods for Digital Computers, 1:191–203, 1960.
[8] Jianqing Fan and Jiancheng Jiang. Generalized likelihood ratio tests for additive models. 2005.
[9] Jerome H. Friedman and Werner Stuetzle. Projection pursuit regression. Journal of the American Statistical Association, 76(376):817–823, 1981.
[10] Harold Hotelling. Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24(6):417, 1933.
[11] Karl Pearson. LIII. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 2(11):559–572, 1901.
[12] Isaac Jacob Schoenberg. Contributions to the problem of approximation of equidistant data by analytic functions. Part B: On the problem of osculatory interpolation. A second class of analytic approximation formulae. Quarterly of Applied Mathematics, 4(2):112–141, 1946.
Description Master's thesis
National Chengchi University (國立政治大學)
Department of Statistics (統計學系)
103354028
Source http://thesis.lib.nccu.edu.tw/record/#G0103354028
Type thesis
URI http://nccur.lib.nccu.edu.tw/handle/140.119/100453
Table of contents
1 Introduction
2 Literature review
2.1 Additive models and the backfitting algorithm
2.2 B-spline regression models
3 Methods
3.1 Principal component analysis (PCA)
3.1.1 Definition
3.1.2 Computing the principal components
3.1.3 Selection criterion
3.2 Forward selection (FS)
3.2.1 Definition
3.2.2 Testing whether a variable in the additive model is significant
3.2.3 Selection criterion
4 Simulation data and comparison of results
4.1 Simulation data
4.1.1 Experiment 1: number of variables p = 20, n > p, 10 simulation runs
4.1.2 Experiment 2: number of variables p = 200, n = p, 10 simulation runs
4.1.3 Experiment 3: number of variables p = 250, n < p
4.2 Comparison of results
5 Conclusions and suggestions
5.1 Conclusions
5.2 Suggestions
Appendix
Format application/pdf, 963090 bytes
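Chapter 4 of the table of contents above describes three simulation experiments (p = 20 with n > p, p = 200 with n = p, and p = 250 with n < p) in which additive fits on the selected variables are scored by integrated absolute error. The sketch below is purely illustrative and is not the thesis code: it fits a nonparametric additive model by backfitting with cubic B-spline smoothers and reports a component-wise integrated absolute error. The function names (backfit, bspline_basis, component_iae), the knot choices, and the toy data-generating model are all hypothetical; in the thesis, the columns passed to backfit would come from the PCA-based or FS-based selection step.

# Illustrative sketch (not the thesis code): backfitting an additive model with
# cubic B-spline smoothers, scored by a component-wise integrated absolute error.
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, knots, degree=3):
    # Clamped B-spline basis evaluated at x; one column per basis function.
    t = np.r_[[knots[0]] * degree, knots, [knots[-1]] * degree]
    n_basis = len(t) - degree - 1
    return np.column_stack([BSpline(t, np.eye(n_basis)[i], degree)(x)
                            for i in range(n_basis)])

def backfit(X, y, n_knots=6, degree=3, n_iter=25):
    # Backfitting: cycle over predictors, refitting each component on the
    # partial residuals by least squares in its B-spline basis.
    n, p = X.shape
    knots = [np.linspace(X[:, j].min(), X[:, j].max(), n_knots) for j in range(p)]
    bases = [bspline_basis(X[:, j], knots[j], degree) for j in range(p)]
    coefs = [np.zeros(b.shape[1]) for b in bases]
    alpha = y.mean()
    for _ in range(n_iter):
        for j in range(p):
            partial = y - alpha - sum(bases[l] @ coefs[l] for l in range(p) if l != j)
            c, *_ = np.linalg.lstsq(bases[j], partial, rcond=None)
            # Clamped B-spline bases sum to one, so subtracting a constant from every
            # coefficient subtracts it from the fitted component; move the component's
            # mean into the intercept to keep the additive decomposition identifiable.
            shift = (bases[j] @ c).mean()
            coefs[j] = c - shift
            alpha += shift
    return alpha, knots, coefs

def component_iae(knots_j, coef_j, true_fn, degree=3, n_grid=400):
    # Riemann approximation of the integral of |fitted_j(x) - true_j(x)| over the
    # observed range, comparing centred versions of the two component functions.
    grid = np.linspace(knots_j[0], knots_j[-1], n_grid)
    fitted = bspline_basis(grid, knots_j, degree) @ coef_j
    truth = true_fn(grid)
    diff = np.abs((fitted - fitted.mean()) - (truth - truth.mean()))
    return diff.mean() * (grid[-1] - grid[0])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    X = rng.uniform(-1.0, 1.0, size=(n, 3))
    true_components = [np.sin, lambda x: x ** 2, lambda x: np.zeros_like(x)]
    y = 1.0 + sum(f(X[:, j]) for j, f in enumerate(true_components))
    y = y + rng.normal(0.0, 0.2, n)
    alpha, knots, coefs = backfit(X, y)
    for j, f in enumerate(true_components):
        print(f"component {j}: IAE = {component_iae(knots[j], coefs[j], f):.3f}")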