NCCU Library — Publications: Theses
Title: 結合主成分分析之神經元剪枝方法應用於優化孿生神經網路 / An Optimization Approach for Siamese Neural Networks Using Principal Component Analysis-Based Neuron Pruning
Author: Wang, Chi-Kang (王冀鋼)
Advisor: Chou, Pei-Ting (周珮婷)
Keywords: Siamese Neural Network; Neuron Pruning; Principal Component Analysis; Unstructured Data; Model Simplification; Classification
Date: 2025
Uploaded: 1-Jul-2025 15:03:45 (UTC+8)

Abstract
Neural network models have demonstrated strong predictive capabilities across a wide range of applications, but hyperparameter tuning remains a critical challenge affecting model performance, particularly in determining the appropriate number of neurons. When the number of neurons is insufficient, the model often fails to capture the complex patterns inherent in the data, reducing predictive accuracy. Conversely, an excessive number of neurons greatly increases the parameter count and computational cost, and may also cause overfitting. To address this issue, this study proposes a neuron pruning strategy based on Principal Component Analysis (PCA), which analyzes the neuron weights of a pre-trained neural network and identifies a subset of representative neurons. To evaluate the applicability and generalizability of the proposed method, a series of experiments was conducted using Siamese Neural Networks (SNNs), which are well suited to low-data scenarios. The experiments covered both structured and unstructured datasets, with models trained and tested under various neuron configurations. The results show that the neuron selection method not only effectively reduces the number of model parameters but, when a high cumulative explained variance ratio is retained, also enables the simplified models to outperform the original pre-trained models.

References

Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4), 433-459.
Bengio, Y. (2012). Practical recommendations for gradient-based training of deep architectures. In Neural Networks: Tricks of the Trade (2nd ed., pp. 437-478). Berlin, Heidelberg: Springer.
Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(1), 281-305.
Bromley, J., Guyon, I., LeCun, Y., Säckinger, E., & Shah, R. (1993). Signature verification using a "Siamese" time delay neural network. Advances in Neural Information Processing Systems, 6.
Cao, Z., Mu, S., Xu, Y., & Dong, M. (2018, December). Image retrieval method based on CNN and dimension reduction. In 2018 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC) (pp. 441-445). IEEE.
Cheng, Y., Wang, D., Zhou, P., & Zhang, T. (2017). A survey of model compression and acceleration for deep neural networks. arXiv preprint arXiv:1710.09282.
Chopra, S., Hadsell, R., & LeCun, Y. (2005, June). Learning a similarity metric discriminatively, with application to face verification. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) (Vol. 1, pp. 539-546). IEEE.
Domhan, T., Springenberg, J. T., & Hutter, F. (2015, July). Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In IJCAI (Vol. 15, pp. 3460-3468).
Gale, T., Elsen, E., & Hooker, S. (2019). The state of sparsity in deep neural networks. arXiv preprint arXiv:1902.09574.
Hadsell, R., Chopra, S., & LeCun, Y. (2006, June). Dimensionality reduction by learning an invariant mapping. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06) (Vol. 2, pp. 1735-1742). IEEE.
Han, S., Pool, J., Tran, J., & Dally, W. (2015). Learning both weights and connections for efficient neural network. Advances in Neural Information Processing Systems, 28.
Hassibi, B., & Stork, D. (1992). Second order derivatives for network pruning: Optimal Brain Surgeon. Advances in Neural Information Processing Systems, 5.
Jolliffe, I. T. (2002). Principal component analysis for special types of data (pp. 338-372). Springer New York.
Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: a review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065), 20150202.
Klein, A., Falkner, S., Springenberg, J. T., & Hutter, F. (2017, February). Learning curve prediction with Bayesian neural networks. In International Conference on Learning Representations.
Koch, G., Zemel, R., & Salakhutdinov, R. (2015, July). Siamese neural networks for one-shot image recognition. In ICML Deep Learning Workshop (Vol. 2, No. 1, pp. 1-30).
LeCun, Y., Denker, J., & Solla, S. (1989). Optimal brain damage. Advances in Neural Information Processing Systems, 2.
Louizos, C., Welling, M., & Kingma, D. P. (2017). Learning sparse neural networks through L_0 regularization. arXiv preprint arXiv:1712.01312.
Molchanov, D., Ashukha, A., & Vetrov, D. (2017, July). Variational dropout sparsifies deep neural networks. In International Conference on Machine Learning (pp. 2498-2507). PMLR.
Mueller, J., & Thyagarajan, A. (2016, March). Siamese recurrent architectures for learning sentence similarity. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 30, No. 1).
Riera, M., Arnau, J. M., & González, A. (2022). DNN pruning with principal component analysis and connection importance estimation. Journal of Systems Architecture, 122, 102336.
Schroff, F., Kalenichenko, D., & Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 815-823).
Snoek, J., Larochelle, H., & Adams, R. P. (2012). Practical Bayesian optimization of machine learning algorithms. Advances in Neural Information Processing Systems, 25.
Young, T., Hazarika, D., Poria, S., & Cambria, E. (2018). Recent trends in deep learning based natural language processing. IEEE Computational Intelligence Magazine, 13(3), 55-75.

Degree: Master's
Institution: National Chengchi University
Department: Department of Statistics
Student ID: 112354032
Source: http://thesis.lib.nccu.edu.tw/record/#G0112354032
Identifier: G0112354032
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/157810
Type: thesis
Format: 1972616 bytes, application/pdf

Table of Contents
Chapter 1 Introduction
Chapter 2 Literature Review
  2-1 Siamese Neural Networks
  2-2 Neuron Pruning
  2-3 Summary
Chapter 3 Methodology
  3-1 Neural Networks
  3-2 Siamese Neural Networks
  3-3 Principal Component Analysis
  3-4 Experimental Procedure
    3-4-1 Sample Processing
    3-4-2 Model Training
    3-4-3 Neuron Selection
    3-4-4 Model Fine-Tuning
    3-4-5 Model Evaluation
Chapter 4 Experimental Results
  4-1 Dataset Overview
  4-2 Model Prediction Performance
Chapter 5 Conclusions and Suggestions
  5-1 Conclusions
  5-2 Future Directions and Suggestions
References
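The abstract describes selecting representative neurons by applying PCA to the weights of a pre-trained layer, keeping enough principal components to reach a cumulative explained variance threshold. The sketch below is only an illustration of that general idea, not the thesis's actual procedure: the function name and the specific selection rule (keep, for each retained component, the neuron with the largest absolute score) are assumptions.

```python
import numpy as np

def select_neurons_pca(weights, evr_threshold=0.9):
    """Illustrative sketch of PCA-based neuron selection.

    `weights` has shape (n_neurons, n_inputs): one row per neuron.
    The selection rule used here is an assumption, not necessarily
    the rule used in the thesis.
    """
    # Center the neuron weight vectors, then take the SVD; rows of vt
    # are the principal directions of the neuron population.
    centered = weights - weights.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Explained variance ratio of each component, and the number k of
    # components needed to reach the cumulative threshold.
    evr = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(evr), evr_threshold)) + 1
    # Score each neuron on the top-k components and keep, per component,
    # the neuron with the largest absolute score (deduplicated).
    scores = centered @ vt[:k].T
    return np.unique(np.argmax(np.abs(scores), axis=0))
```

A higher `evr_threshold` retains more components and therefore selects at least as many neurons, matching the abstract's observation that performance depends on the cumulative explained variance ratio retained.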
