Enhancements to the Bayesian Correlated t-Test and Its Application to Cross-Validation Data (貝氏相關 t 檢定之改進及其在交叉驗證資料中的應用)


Title	Enhancements to the Bayesian Correlated t-Test and Its Application to Cross-Validation Data (貝氏相關 t 檢定之改進及其在交叉驗證資料中的應用)
Author	許育菱 Xu, Yu-Ling
Contributors	翁久幸 Weng, Chui-Hsing (advisor); 許育菱 Xu, Yu-Ling (author)
Keywords	Cross-validation; Accuracy; Statistical testing; Correlated t-test; Bayesian correlated t-test; Image enhancement; Image classification
Date	2024
Upload time	4-Sep-2024 14:57:11 (UTC+8)
Abstract	In statistics and machine learning, cross-validation is a common method that divides the original dataset into multiple subsets so that a model is trained and validated repeatedly on different subsets of the data. Analyzing the accuracy data produced by cross-validation makes it possible to evaluate a model's performance and robustness and to compare accuracy across models. This study performs statistical tests on cross-validation data using the Correlated t-test and the Bayesian correlated t-test, and proposes the Bayesian correlated t-test2, which modifies the covariance matrix of the Bayesian correlated t-test. Results on simulated data show that the Bayesian correlated t-test2 outperforms the Bayesian correlated t-test in most cases. The Bayesian correlated t-test2 reaches conclusions very similar to those of the Correlated t-test, but has the advantage of providing additional information. In the real-data analysis, the three tests are applied to compare the effect of image enhancement methods on image classification performance; in most datasets, classification results are better after Contrast Stretching and relatively worse after Sharpening.
References	Benavoli, A., Corani, G., Demšar, J., and Zaffalon, M. (2017). Time for a change: a tutorial for comparing multiple classifiers through Bayesian analysis. Journal of Machine Learning Research, 18(77):1–36. Corani, G. and Benavoli, A. (2015). A Bayesian approach for comparing cross-validated algorithms on multiple data sets. Machine Learning, 100(2):285–304. Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7:1–30. Demšar, J. (2008). On the appropriateness of statistical tests in machine learning. In Workshop on Evaluation Methods for Machine Learning in Conjunction with ICML, page 65. Dietterich, T. G. (1998). Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1923. Gonzalez, R. C. (2009). Digital image processing. Pearson Education India. Jain, A. (1989). Fundamentals of digital image processing. Prentice-Hall. Jeffreys, H. (1935). Some tests of significance, treated by the theory of probability. In Mathematical Proceedings of the Cambridge Philosophical Society, volume 31, pages 203–222. Cambridge University Press. Jeffreys, H. (1998). The theory of probability. OUP Oxford. Kass, R. E. and Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90(430):773–795. Kitchen, L. and Rosenfeld, A. (1982). Gray-level corner detection. Pattern Recognition Letters, 1(2):95–102. Krizhevsky, A. and Hinton, G. (2009). Learning multiple layers of features from tiny images. Technical report, University of Toronto. Kruschke, J. (2014). Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press. Kruschke, J. K. (2013). Bayesian estimation supersedes the t test. Journal of Experimental Psychology: General, 142(2):573. Nadeau, C. and Bengio, Y. (2003). Inference for the generalization error. Machine Learning, 52(3):239–281.
Description	Master's thesis, Department of Statistics, National Chengchi University, 111354026
Source	http://thesis.lib.nccu.edu.tw/record/#G0111354026
Type	thesis
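
The two baseline tests named in the abstract can be sketched as follows. This is an illustrative sketch based on the standard formulations in the cited references (Nadeau and Bengio, 2003; Corani and Benavoli, 2015), not the thesis's own code, and it does not implement the proposed Bayesian correlated t-test2, whose covariance modification is not described in this record. The function names, the sample differences, the correlation value rho = 0.1, and the ROPE half-width of 0.01 are all assumptions chosen for illustration.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of the standard Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Student t CDF via Simpson's rule on [0, |x|]; accurate enough for a sketch."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def location_scale_df(diffs, rho):
    """Mean, correlation-corrected scale, and degrees of freedom.

    diffs: per-fold accuracy differences between two models.
    rho:   assumed correlation between folds; a common heuristic is the
           test-set fraction (1/k for k-fold cross-validation).
    """
    n = len(diffs)
    scale = math.sqrt((1 / n + rho / (1 - rho)) * variance(diffs))
    return mean(diffs), scale, n - 1


def correlated_t_test(diffs, rho):
    """Frequentist correlated t-test: returns (t statistic, two-sided p-value)."""
    m, scale, df = location_scale_df(diffs, rho)
    t = m / scale
    return t, 2 * (1 - t_cdf(abs(t), df))


def bayesian_correlated_t_test(diffs, rho, rope=0.01):
    """Posterior probabilities that the mean difference lies left of, inside,
    or right of the region of practical equivalence [-rope, rope]."""
    m, scale, df = location_scale_df(diffs, rho)
    p_left = t_cdf((-rope - m) / scale, df)
    p_right = 1 - t_cdf((rope - m) / scale, df)
    return p_left, 1 - p_left - p_right, p_right


# Hypothetical accuracy differences from 10-fold cross-validation.
diffs = [0.02, 0.03, 0.01, 0.04, 0.02, 0.03, 0.02, 0.01, 0.03, 0.02]
t_stat, p_value = correlated_t_test(diffs, rho=0.1)
p_left, p_rope, p_right = bayesian_correlated_t_test(diffs, rho=0.1)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
print(f"P(left) = {p_left:.3f}, P(rope) = {p_rope:.3f}, P(right) = {p_right:.3f}")
```

The Bayesian variant reports three posterior probabilities rather than a single p-value, which is one way to read the abstract's remark that the Bayesian tests provide additional information.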

dc.contributor.advisor	翁久幸	zh_TW
dc.contributor.advisor	Weng, Chui-Hsing	en_US
dc.contributor.author (Authors)	許育菱	zh_TW
dc.contributor.author (Authors)	Xu, Yu-Ling	en_US
dc.creator (Author)	許育菱	zh_TW
dc.creator (Author)	Xu, Yu-Ling	en_US
dc.date (Date)	2024	en_US
dc.date.accessioned	4-Sep-2024 14:57:11 (UTC+8)	-
dc.date.available	4-Sep-2024 14:57:11 (UTC+8)	-
dc.date.issued (Upload time)	4-Sep-2024 14:57:11 (UTC+8)	-
dc.identifier (Other Identifiers)	G0111354026	en_US
dc.identifier.uri (URI)	https://nccur.lib.nccu.edu.tw/handle/140.119/153368	-
dc.description (Description)	Master's thesis	zh_TW
dc.description (Description)	National Chengchi University	zh_TW
dc.description (Description)	Department of Statistics	zh_TW
dc.description (Description)	111354026	zh_TW
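dc.description.tableofcontents	Chapter 1 Introduction; Chapter 2 Literature Review; 2.1 Bayesian Statistics Applied to Statistical Testing; 2.2 Statistical Testing Applied to Machine Learning; Chapter 3 Methodology; 3.1 Cross-Validation Data; 3.2 Statistical Testing Methods; 3.2.1 Correlated t-test; 3.2.2 Bayesian correlated t-test; 3.2.3 Bayesian correlated t-test2; 3.2.4 ROPE; 3.3 Cross-Entropy; 3.4 Image Enhancement Methods; 3.4.1 Contrast Stretching; 3.4.2 Histogram Equalization; 3.4.3 Sharpening; Chapter 4 Simulation Data Analysis; 4.1 Data Generation; 4.2 Experiment 1; 4.3 Conclusions of Experiment 1; 4.4 Experiment 2; 4.5 Conclusions of Experiment 2; 4.6 Conclusions of the Simulation Data Analysis; Chapter 5 Real Data Analysis; 5.1 Data Description; 5.1.1 CIFAR-100 Dataset; 5.1.2 Obtaining Cross-Validation Data; 5.2 Test Results; Chapter 6 Conclusions and Recommendations; 6.1 Conclusions and Practical Applications; 6.2 Limitations and Future Directions; References; Appendix	zh_TW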
dc.format.extent	2978122 bytes	-
dc.format.mimetype	application/pdf	-
dc.source.uri (Source)	http://thesis.lib.nccu.edu.tw/record/#G0111354026	en_US
dc.subject (Keywords)	Cross-validation	en_US
dc.subject (Keywords)	Accuracy	en_US
dc.subject (Keywords)	Statistical testing	en_US
dc.subject (Keywords)	Correlated t-test	en_US
dc.subject (Keywords)	Bayesian correlated t-test	en_US
dc.subject (Keywords)	Image enhancement	en_US
dc.subject (Keywords)	Image classification	en_US
dc.title (Title)	貝氏相關 t 檢定之改進及其在交叉驗證資料中的應用	zh_TW
dc.title (Title)	Enhancements to the Bayesian Correlated t-Test and Its Application to Cross-Validation Data	en_US
dc.type (Type)	thesis	en_US
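
The abstract compares Contrast Stretching and Sharpening as preprocessing steps for image classification. As a rough illustration of what these two enhancements do, the sketch below applies a min-max contrast stretch and a 3x3 Laplacian-based sharpening kernel to a small grayscale image stored as nested lists. The exact variants and parameters used in the thesis are not given in this record, so the function names, the kernel, the 0 to 255 intensity range, and the edge-replication border handling are all assumptions.

```python
def contrast_stretch(img, lo=0, hi=255):
    """Linearly map the image's [min, max] intensity range onto [lo, hi]."""
    flat = [p for row in img for p in row]
    p_min, p_max = min(flat), max(flat)
    if p_max == p_min:
        # A constant image has no range to stretch; return a copy unchanged.
        return [row[:] for row in img]
    gain = (hi - lo) / (p_max - p_min)
    return [[round(lo + (p - p_min) * gain) for p in row] for row in img]


# Assumed kernel: identity plus a 4-neighbour Laplacian (weights sum to 1).
SHARPEN_KERNEL = [[0, -1, 0],
                  [-1, 5, -1],
                  [0, -1, 0]]


def sharpen(img):
    """Apply the 3x3 kernel, replicating edge pixels and clipping to [0, 255]."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    acc += SHARPEN_KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = min(max(acc, 0), 255)
    return out


# Hypothetical low-contrast 2x2 image and a 3x3 image with one bright pixel.
stretched = contrast_stretch([[50, 100], [150, 200]])
sharpened = sharpen([[10, 10, 10], [10, 50, 10], [10, 10, 10]])
print(stretched)
print(sharpened)
```

Contrast stretching widens the intensity range without changing the ordering of pixel values, whereas sharpening amplifies local differences, which also amplifies noise.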
