Title: 利用孿生神經網路之特徵提取增強分類模型的表現
Title (English): Improving Classification Model Performance through Feature Extraction with Siamese Neural Network
Author: Shen, Guan-Yu (沈冠宇)
Advisor: 周珮婷
Keywords: Siamese Neural Network; Feature Extraction; Feature Learning; Dimensionality Reduction; Classification
Date: 2024
Uploaded: 5-Aug-2024 14:00:50 (UTC+8)

Abstract
A Siamese neural network is a supervised model that learns to measure the similarity between two data samples. During training, a pair of samples is fed simultaneously into two subnetworks with identical architecture and shared weights, and the similarity between the two output vectors is computed as the model output, indicating whether the pair is similar or dissimilar. This study combines the Siamese neural network with classification tasks: the data are first used to train a Siamese network, which then serves as a feature extractor; the neuron outputs of the subnetworks' hidden layers are taken as new features, and these features are used to train an XGBoost classifier, validated on six datasets. The results show that models trained on the extracted features outperform models trained on the original features. In addition, we apply Principal Component Analysis to the extracted features before training to achieve dimensionality reduction. The findings suggest that, in most cases, once the cumulative explained variance ratio reaches 85% or more, models trained on these principal components outperform models trained on the original features.

Description: Master's thesis
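The pair-construction and similarity-training scheme described in the abstract can be sketched in plain NumPy. This is a minimal illustration, not the thesis code: the one-positive/one-negative sampling strategy, the toy data, and the margin value of 1.0 are assumptions; the contrastive loss follows Hadsell et al. (2006), which the thesis cites.

```python
import numpy as np

def make_pairs(X, y, rng):
    """Build training pairs: label 1 for same-class (similar) pairs,
    label 0 for different-class (dissimilar) pairs."""
    a, b, labels = [], [], []
    for i in range(len(X)):
        same = np.flatnonzero(y == y[i])
        same = same[same != i]                  # never pair a sample with itself
        a.append(X[i]); b.append(X[rng.choice(same)]); labels.append(1)
        diff = np.flatnonzero(y != y[i])
        a.append(X[i]); b.append(X[rng.choice(diff)]); labels.append(0)
    return np.array(a), np.array(b), np.array(labels)

def contrastive_loss(d, label, margin=1.0):
    """Contrastive loss: similar pairs are pulled together (d -> 0),
    dissimilar pairs pushed apart until their distance exceeds the margin."""
    return float(np.mean(label * d**2
                         + (1 - label) * np.maximum(margin - d, 0.0)**2))

# Toy data: two well-separated classes in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(5.0, 0.1, (10, 2))])
y = np.array([0] * 10 + [1] * 10)

xa, xb, lab = make_pairs(X, y, rng)
dist = np.linalg.norm(xa - xb, axis=1)   # stand-in for subnetwork output distance
loss = contrastive_loss(dist, lab)       # small here: the classes are far apart
```

In the actual method, `dist` would be computed between the output vectors of the two weight-sharing subnetworks rather than between raw inputs, and the loss would be minimized by backpropagation.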
National Chengchi University
Department of Statistics
Student ID: 111354031
Source: http://thesis.lib.nccu.edu.tw/record/#G0111354031
Type: thesis
Identifier (Other): G0111354031
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/152783
Format: 1366283 bytes, application/pdf

Table of Contents
Chapter 1  Introduction
Chapter 2  Literature Review
  2.1 Siamese Neural Networks
  2.2 Feature Extraction
  2.3 Feature Extraction with Siamese Neural Networks
Chapter 3  Methodology
  3.1 Neural Networks
  3.2 Siamese Neural Networks
    3.2.1 Siamese Network Architecture
    3.2.2 Contrastive Loss Function
  3.3 Principal Component Analysis
  3.4 XGBoost
  3.5 Model Evaluation Metrics
  3.6 Model Training Procedure
    3.6.1 Paired-Data Preparation
    3.6.2 Siamese Network Hyperparameter Tuning
    3.6.3 Feature Extraction
    3.6.4 XGBoost Training
Chapter 4  Results
  4.1 Datasets
  4.2 Model Predictive Performance
    4.2.1 Performance Using All Features
    4.2.2 Performance after Principal Component Analysis on Extracted Features
Chapter 5  Conclusions and Suggestions
  5.1 Conclusions
  5.2 Future Directions and Suggestions
References
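The dimensionality-reduction step described in the abstract (keeping only enough principal components to reach 85% cumulative explained variance before classifier training) can be sketched in NumPy. The SVD-based implementation and the synthetic "extracted features" below are assumptions for illustration, not the thesis code.

```python
import numpy as np

def pca_reduce(F, var_ratio=0.85):
    """Project features onto the fewest leading principal components whose
    cumulative explained variance ratio reaches `var_ratio`."""
    Fc = F - F.mean(axis=0)                         # center each feature
    # SVD of the centered matrix: rows of Vt are principal directions,
    # squared singular values are proportional to component variances.
    U, S, Vt = np.linalg.svd(Fc, full_matrices=False)
    var = S**2 / (len(F) - 1)
    ratio = np.cumsum(var) / var.sum()              # cumulative explained variance
    k = int(np.searchsorted(ratio, var_ratio)) + 1  # smallest k reaching threshold
    return Fc @ Vt[:k].T, ratio[:k]

# Toy "extracted features": 3 informative dimensions plus 5 near-constant
# noise dimensions, so most variance lives in a 3-D subspace.
rng = np.random.default_rng(1)
F = np.hstack([rng.normal(0.0, 1.0, (200, 3)), rng.normal(0.0, 0.01, (200, 5))])

Z, ratio = pca_reduce(F)   # Z has far fewer columns than F; ratio[-1] >= 0.85
```

The reduced matrix `Z` would then replace the full extracted-feature matrix as XGBoost training input, which is the comparison the thesis evaluates in Section 4.2.2.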