A Study of Similarity Indices and the Chi-Squared Test (相似性指數與卡方檢定之探討)


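Title	A Study of Similarity Indices and the Chi-Squared Test (相似性指數與卡方檢定之探討)
Author	歐陽致平
Contributors	余清祥; 歐陽致平
Keywords	Chi-squared test; goodness-of-fit test; similarity index; computer simulation
Date	2007
Uploaded	6-May-2016 16:35:56 (UTC+8)
Abstract	Goodness-of-fit tests check whether observations conform to a specified characteristic and are among the most widely used tests in statistics; the chi-squared test is the most common of them. One reason for its popularity is its flexibility: the usual requirement is only that, after grouping, every cell have an expected count of at least 5. When the sample is small, cells should be pooled, but there is still no consensus on how to pool them. This thesis uses computer simulation to study the chi-squared goodness-of-fit test when the sample size is insufficient, in the hope that the results can serve as a reference for pooling. In addition, similarity indices are generally used to compare two populations and are not restricted by expected cell counts, so we also examine whether similarity indices can be used for goodness-of-fit testing. Earlier studies suggest that goodness-of-fit tests may behave differently when the population is close to uniform, so we treat two cases separately: populations that follow (or are close to) a uniform distribution, and populations that follow a geometric distribution. When the population is close to uniform, we find that the chi-squared goodness-of-fit test is not bound by the rule that expected counts be at least 5: the unpooled chi-squared test has a type I error that meets the significance level, and it has greater power than the pooled test. When the population follows a geometric distribution, however, the chi-squared test must rely on pooling to improve its type I error. We also find that, under all the conditions considered, the similarity index has lower power than the pooled chi-squared test.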
Description	Master's thesis; National Chengchi University; Department of Statistics; 94354008
Source	http://thesis.lib.nccu.edu.tw/record/#G0094354008
Type	thesis
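
The simulation design summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis's code: the truncated geometric parametrization (cell probabilities proportional to `alpha**i`), the left-to-right pooling rule, the replication count, and all function names are assumptions made for illustration only. The sketch draws samples under the null hypothesis and records how often Pearson's chi-squared test rejects at the 5% level, with or without pooling cells whose expected count is below 5.

```python
import random

# Upper 5% critical values of the chi-squared distribution, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
                6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307}


def truncated_geometric(alpha, k):
    """Assumed parametrization: P(cell i) is proportional to alpha**i, i = 0..k-1."""
    weights = [alpha ** i for i in range(k)]
    total = sum(weights)
    return [w / total for w in weights]


def pool_small_cells(probs, n, min_expected=5.0):
    """Merge adjacent cells, left to right, until each group expects >= min_expected.

    This is one hypothetical pooling rule among many; a leftover tail that
    never reaches the threshold is merged into the last complete group.
    """
    groups, current, mass = [], [], 0.0
    for i, p in enumerate(probs):
        current.append(i)
        mass += p
        if n * mass >= min_expected:
            groups.append(current)
            current, mass = [], 0.0
    if current:
        if groups:
            groups[-1].extend(current)
        else:
            groups.append(current)
    return groups


def type_one_error(probs, n, reps=2000, pool=False, seed=0):
    """Estimate the rejection rate of the 5% chi-squared test when H0 is true."""
    rng = random.Random(seed)
    k = len(probs)
    groups = pool_small_cells(probs, n) if pool else [[i] for i in range(k)]
    df = len(groups) - 1
    if df not in CHI2_CRIT_05:
        raise ValueError("need between 2 and 11 groups after pooling")
    expected = [n * sum(probs[i] for i in g) for g in groups]
    cell_to_group = {i: gi for gi, g in enumerate(groups) for i in g}
    rejections = 0
    for _ in range(reps):
        # Sample n observations from the null distribution itself.
        observed = [0] * len(groups)
        for cell in rng.choices(range(k), weights=probs, k=n):
            observed[cell_to_group[cell]] += 1
        stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        if stat > CHI2_CRIT_05[df]:
            rejections += 1
    return rejections / reps
```

For example, `type_one_error([0.1] * 10, n=20)` estimates the type I error of the unpooled test for a uniform population with an expected count of 2 per cell, and `type_one_error(truncated_geometric(0.5, 10), n=20, pool=True)` does the same for a pooled test under the assumed geometric shape. Comparing such estimates against the nominal 0.05 is the kind of check the abstract describes.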

dc.contributor.advisor	余清祥	zh_TW
dc.contributor.author (Authors)	歐陽致平	zh_TW
dc.creator (Author)	歐陽致平	zh_TW
dc.date (Date)	2007	en_US
dc.date.accessioned	6-May-2016 16:35:56 (UTC+8)	-
dc.date.available	6-May-2016 16:35:56 (UTC+8)	-
dc.date.issued (Uploaded)	6-May-2016 16:35:56 (UTC+8)	-
dc.identifier (Other Identifiers)	G0094354008	en_US
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/94412	-
dc.description (Description)	Master's thesis	zh_TW
dc.description (Description)	National Chengchi University	zh_TW
dc.description (Description)	Department of Statistics	zh_TW
dc.description (Description)	94354008	zh_TW
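dc.description.tableofcontents	Acknowledgments; Abstract; Table of Contents; List of Figures; List of Tables; Chapter 1: Introduction (Section 1: Research Motivation; Section 2: Research Objectives and Methods); Chapter 2: History of the Chi-Squared Test and Related Research Results (Section 1: Literature Review on the Chi-Squared Test; Section 2: The Necessity of Pooling in the Chi-Squared Test; Section 3: The Chi-Squared Test under the Uniform Distribution); Chapter 3: The Chi-Squared Test under Non-Uniform Distributions (Section 1: The Chi-Squared Test under the Geometric Distribution; Section 2: Pooling Methods for the Chi-Squared Test); Chapter 4: Similarity Indices (Section 1: Similarity Indices under the Uniform Distribution; Section 2: Similarity Indices under the Geometric Distribution); Chapter 5: The Chi-Squared Test and Similarity Indices in Two Dimensions (Section 1: Pooling Methods for the Two-Dimensional Chi-Squared Test; Section 2: The Chi-Squared Test under the Two-Dimensional Uniform Distribution; Section 3: The Chi-Squared Test under Two-Dimensional Non-Uniform Distributions; Section 4: Similarity Indices for Two-Dimensional Data; Section 5: Comparison of the Two Goodness-of-Fit Methods); Chapter 6: The Two-Sample Chi-Squared Goodness-of-Fit Test; Chapter 7: Conclusions and Recommendations (Section 1: Conclusions; Section 2: Recommendations and Directions for Future Research); References; Appendices (Appendix 1: Type I error of the two-sample chi-squared test when the population is geometric, alpha=0.5, k=10; Appendix 2: Type I error of the two-sample chi-squared test when the population is geometric, alpha=0.5, k=2; Appendix 3: Type I error of the two-sample chi-squared test when the population is geometric, alpha=0.5, k=3)	zh_TW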
dc.source.uri (資料來源)	http://thesis.lib.nccu.edu.tw/record/#G0094354008	en_US
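dc.subject (Keywords)	Chi-squared test	zh_TW
dc.subject (Keywords)	Goodness-of-fit test	zh_TW
dc.subject (Keywords)	Similarity index	zh_TW
dc.subject (Keywords)	Computer simulation	zh_TW
dc.title (Title)	A Study of Similarity Indices and the Chi-Squared Test (相似性指數與卡方檢定之探討)	zh_TW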
dc.type (資料類型)	thesis	en_US
