Title: 利用孿生神經網路之特徵提取增強分類模型的表現
Title (English): Improving Classification Model Performance through Feature Extraction with Siamese Neural Network
Author: Shen, Guan-Yu (沈冠宇)
Advisor: 周珮婷
Keywords: Siamese Neural Network; Feature Extraction; Feature Learning; Dimensionality Reduction; Classification
Date: 2024
Uploaded: 5-Aug-2024 14:00:50 (UTC+8)

Abstract
A Siamese neural network is a supervised model that learns to measure the similarity between two data samples. During training, a pair of samples is fed simultaneously into two subnetworks with identical architecture and shared weights, and the similarity between the two output vectors is computed as the model output, indicating whether the pair is similar or dissimilar. This study combines the Siamese neural network with classification tasks: the data are first used to train a Siamese network, which then serves as a feature extractor; the neuron outputs of the subnetworks' hidden layers are taken as new features, and these features are used to train an XGBoost classifier, validated on six datasets. The results show that models trained on the extracted features outperform models trained on the original features. In addition, we apply Principal Component Analysis to the extracted features before training to achieve dimensionality reduction. The findings suggest that, in most cases, once the cumulative explained variance ratio reaches 85% or more, models trained on these principal components outperform models trained on the original features.

Description: Master's thesis
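The pair-construction and similarity-training scheme described in the abstract can be sketched in plain NumPy. This is a minimal illustration, not the thesis code: the one-positive/one-negative sampling strategy, the toy data, and the margin value of 1.0 are assumptions; the contrastive loss follows Hadsell et al. (2006), which the thesis cites.

```python
import numpy as np

def make_pairs(X, y, rng):
    """Build training pairs: label 1 for same-class (similar) pairs,
    label 0 for different-class (dissimilar) pairs."""
    a, b, labels = [], [], []
    for i in range(len(X)):
        same = np.flatnonzero(y == y[i])
        same = same[same != i]                  # never pair a sample with itself
        a.append(X[i]); b.append(X[rng.choice(same)]); labels.append(1)
        diff = np.flatnonzero(y != y[i])
        a.append(X[i]); b.append(X[rng.choice(diff)]); labels.append(0)
    return np.array(a), np.array(b), np.array(labels)

def contrastive_loss(d, label, margin=1.0):
    """Contrastive loss: similar pairs are pulled together (d -> 0),
    dissimilar pairs pushed apart until their distance exceeds the margin."""
    return float(np.mean(label * d**2
                         + (1 - label) * np.maximum(margin - d, 0.0)**2))

# Toy data: two well-separated classes in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(5.0, 0.1, (10, 2))])
y = np.array([0] * 10 + [1] * 10)

xa, xb, lab = make_pairs(X, y, rng)
dist = np.linalg.norm(xa - xb, axis=1)   # stand-in for subnetwork output distance
loss = contrastive_loss(dist, lab)       # small here: the classes are far apart
```

In the actual method, `dist` would be computed between the output vectors of the two weight-sharing subnetworks rather than between raw inputs, and the loss would be minimized by backpropagation.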
National Chengchi University
Department of Statistics
Student ID: 111354031
Source: http://thesis.lib.nccu.edu.tw/record/#G0111354031
Type: thesis
Identifier (Other): G0111354031
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/152783
Format: 1366283 bytes, application/pdf

Table of Contents
Chapter 1  Introduction
Chapter 2  Literature Review
  2.1 Siamese Neural Networks
  2.2 Feature Extraction
  2.3 Feature Extraction with Siamese Neural Networks
Chapter 3  Methodology
  3.1 Neural Networks
  3.2 Siamese Neural Networks
    3.2.1 Siamese Network Architecture
    3.2.2 Contrastive Loss Function
  3.3 Principal Component Analysis
  3.4 XGBoost
  3.5 Model Evaluation Metrics
  3.6 Model Training Procedure
    3.6.1 Paired-Data Preparation
    3.6.2 Siamese Network Hyperparameter Tuning
    3.6.3 Feature Extraction
    3.6.4 XGBoost Training
Chapter 4  Results
  4.1 Datasets
  4.2 Model Predictive Performance
    4.2.1 Performance Using All Features
    4.2.2 Performance after Principal Component Analysis on Extracted Features
Chapter 5  Conclusions and Suggestions
  5.1 Conclusions
  5.2 Future Directions and Suggestions
References
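The dimensionality-reduction step described in the abstract (keeping only enough principal components to reach 85% cumulative explained variance before classifier training) can be sketched in NumPy. The SVD-based implementation and the synthetic "extracted features" below are assumptions for illustration, not the thesis code.

```python
import numpy as np

def pca_reduce(F, var_ratio=0.85):
    """Project features onto the fewest leading principal components whose
    cumulative explained variance ratio reaches `var_ratio`."""
    Fc = F - F.mean(axis=0)                         # center each feature
    # SVD of the centered matrix: rows of Vt are principal directions,
    # squared singular values are proportional to component variances.
    U, S, Vt = np.linalg.svd(Fc, full_matrices=False)
    var = S**2 / (len(F) - 1)
    ratio = np.cumsum(var) / var.sum()              # cumulative explained variance
    k = int(np.searchsorted(ratio, var_ratio)) + 1  # smallest k reaching threshold
    return Fc @ Vt[:k].T, ratio[:k]

# Toy "extracted features": 3 informative dimensions plus 5 near-constant
# noise dimensions, so most variance lives in a 3-D subspace.
rng = np.random.default_rng(1)
F = np.hstack([rng.normal(0.0, 1.0, (200, 3)), rng.normal(0.0, 0.01, (200, 5))])

Z, ratio = pca_reduce(F)   # Z has far fewer columns than F; ratio[-1] >= 0.85
```

The reduced matrix `Z` would then replace the full extracted-feature matrix as XGBoost training input, which is the comparison the thesis evaluates in Section 4.2.2.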