Title 充分維度縮減於整體性檢定之應用
Application of sufficient dimension reduction to global test
Author 徐碩亨
Hsu, Shuo Heng
Contributors 薛慧敏
Hsueh, Hui Min
徐碩亨
Hsu, Shuo Heng
Keywords Dimension reduction
Sliced average variance estimation
Gene set analysis
Permutation p-value
Date 2012
Upload time 11-Jul-2013 16:37:19 (UTC+8)
Abstract As technology advances, the volume of data that must be processed keeps growing. In the analysis of massive data sets, dimension reduction helps improve efficiency. This thesis introduces the sliced average variance estimation (SAVE) method of dimension reduction and applies it to the problem of testing global association. We consider the marginal dimension test in SAVE and use permutation resampling to construct the null distribution of the test statistic, from which a permutation p-value is computed for statistical inference. This global association test can be used in gene set analysis to assess the degree of association between a given gene set and a phenotype variable. Finally, we simulate the type I error rate and the power of the test and compare them with those of previously proposed methods.
Description Master's
National Chengchi University
Graduate Institute of Statistics
100354029
101
Source http://thesis.lib.nccu.edu.tw/record/#G1003540292
Type thesis
dc.identifier (Other Identifiers) G1003540292
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/58787
dc.description.tableofcontents
Abstract
Contents
1. Introduction
2. Overview of dimension reduction
2.1 Dimension reduction and the central subspace
2.2 Sliced average variance estimation
2.3 Significance test for global association
3. Simulation study
4. Conclusions and suggestions
References
dc.format.extent 897061 bytes
dc.format.mimetype application/pdf
dc.language.iso en_US
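Note The abstract describes building the null distribution of a SAVE-based test statistic by permutation resampling and reading off a permutation p-value. The sketch below illustrates that general idea only and is not the thesis's own procedure. It assumes a categorical response whose distinct values serve as slices, more observations than predictors, and at least two observations per slice. The statistic used here (n times the trace of the sample SAVE kernel matrix) is a stand-in chosen for simplicity, not the marginal dimension test statistic studied in the thesis. The function names `save_statistic` and `permutation_p_value` are hypothetical.

```python
import numpy as np


def save_statistic(x, y):
    """Return n * trace of the sample SAVE kernel matrix (illustrative statistic).

    Slices are the distinct values of y. Assumes n > p and a nonsingular
    sample covariance matrix.
    """
    n, p = x.shape
    # Standardize the predictors: z = (x - mean) @ cov^{-1/2}.
    centered = x - x.mean(axis=0)
    cov = np.atleast_2d(np.cov(centered, rowvar=False))
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = centered @ inv_sqrt

    # SAVE kernel: weighted sum over slices of (I - Cov(z | slice))^2.
    kernel = np.zeros((p, p))
    for label in np.unique(y):
        z_h = z[y == label]
        diff = np.eye(p) - np.atleast_2d(np.cov(z_h, rowvar=False))
        kernel += (len(z_h) / n) * (diff @ diff)
    return n * np.trace(kernel)


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Permutation p-value: shuffle y to break any association with x."""
    rng = np.random.default_rng(seed)
    observed = save_statistic(x, y)
    null = np.array(
        [save_statistic(x, rng.permutation(y)) for _ in range(n_perm)]
    )
    # Add one to numerator and denominator so the p-value is never zero.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


# Simulated example: two groups of 50 whose predictors differ in variance.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 50)
x = rng.normal(size=(100, 5))
x[y == 1] *= 3.0
p_value = permutation_p_value(x, y, n_perm=199)
```

In this simulated example the two groups differ only in variance, which is the kind of second-moment dependence SAVE is designed to detect, so the p-value should come out small. Under a true null the permuted statistics and the observed one would be exchangeable.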