NCCU Library — Publications: Theses
Title: 結合主成分分析之神經元剪枝方法應用於優化孿生神經網路 / An Optimization Approach for Siamese Neural Networks Using Principal Component Analysis-Based Neuron Pruning
Author: Wang, Chi-Kang (王冀鋼)
Advisor: Chou, Pei-Ting (周珮婷)
Keywords: Siamese Neural Network; Neuron Pruning; Principal Component Analysis; Unstructured Data; Model Simplification; Classification
Date: 2025
Uploaded: 1-Jul-2025 15:03:45 (UTC+8)

Abstract
Neural network models have demonstrated strong predictive capabilities across a wide range of applications, but hyperparameter tuning remains a critical challenge affecting model performance, particularly in determining the appropriate number of neurons. When the number of neurons is insufficient, the model often fails to capture the complex patterns inherent in the data, reducing predictive accuracy. Conversely, an excessive number of neurons greatly increases the parameter count and computational cost, and may also cause overfitting. To address this issue, this study proposes a neuron pruning strategy based on Principal Component Analysis (PCA), which analyzes the neuron weights of a pre-trained neural network and identifies a subset of representative neurons. To evaluate the applicability and generalizability of the proposed method, a series of experiments was conducted using Siamese Neural Networks (SNNs), which are well suited to low-data scenarios. The experiments covered both structured and unstructured datasets, with models trained and tested under various neuron configurations. The results show that the neuron selection method not only effectively reduces the number of model parameters but, when a high cumulative explained variance ratio is retained, also enables the simplified models to outperform the original pre-trained models.

References

Abdi, H., & Williams, L. J. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4), 433-459.
Bengio, Y. (2012). Practical recommendations for gradient-based training of deep architectures. In Neural Networks: Tricks of the Trade (2nd ed., pp. 437-478). Berlin, Heidelberg: Springer.
Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(1), 281-305.
Bromley, J., Guyon, I., LeCun, Y., Säckinger, E., & Shah, R. (1993). Signature verification using a "Siamese" time delay neural network. Advances in Neural Information Processing Systems, 6.
Cao, Z., Mu, S., Xu, Y., & Dong, M. (2018, December). Image retrieval method based on CNN and dimension reduction. In 2018 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC) (pp. 441-445). IEEE.
Cheng, Y., Wang, D., Zhou, P., & Zhang, T. (2017). A survey of model compression and acceleration for deep neural networks. arXiv preprint arXiv:1710.09282.
Chopra, S., Hadsell, R., & LeCun, Y. (2005, June). Learning a similarity metric discriminatively, with application to face verification. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) (Vol. 1, pp. 539-546). IEEE.
Domhan, T., Springenberg, J. T., & Hutter, F. (2015, July). Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In IJCAI (Vol. 15, pp. 3460-3468).
Gale, T., Elsen, E., & Hooker, S. (2019). The state of sparsity in deep neural networks. arXiv preprint arXiv:1902.09574.
Hadsell, R., Chopra, S., & LeCun, Y. (2006, June). Dimensionality reduction by learning an invariant mapping. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06) (Vol. 2, pp. 1735-1742). IEEE.
Han, S., Pool, J., Tran, J., & Dally, W. (2015). Learning both weights and connections for efficient neural network. Advances in Neural Information Processing Systems, 28.
Hassibi, B., & Stork, D. (1992). Second order derivatives for network pruning: Optimal Brain Surgeon. Advances in Neural Information Processing Systems, 5.
Jolliffe, I. T. (2002). Principal component analysis for special types of data (pp. 338-372). Springer New York.
Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: a review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065), 20150202.
Klein, A., Falkner, S., Springenberg, J. T., & Hutter, F. (2017, February). Learning curve prediction with Bayesian neural networks. In International Conference on Learning Representations.
Koch, G., Zemel, R., & Salakhutdinov, R. (2015, July). Siamese neural networks for one-shot image recognition. In ICML Deep Learning Workshop (Vol. 2, No. 1, pp. 1-30).
LeCun, Y., Denker, J., & Solla, S. (1989). Optimal brain damage. Advances in Neural Information Processing Systems, 2.
Louizos, C., Welling, M., & Kingma, D. P. (2017). Learning sparse neural networks through L_0 regularization. arXiv preprint arXiv:1712.01312.
Molchanov, D., Ashukha, A., & Vetrov, D. (2017, July). Variational dropout sparsifies deep neural networks. In International Conference on Machine Learning (pp. 2498-2507). PMLR.
Mueller, J., & Thyagarajan, A. (2016, March). Siamese recurrent architectures for learning sentence similarity. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 30, No. 1).
Riera, M., Arnau, J. M., & González, A. (2022). DNN pruning with principal component analysis and connection importance estimation. Journal of Systems Architecture, 122, 102336.
Schroff, F., Kalenichenko, D., & Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 815-823).
Snoek, J., Larochelle, H., & Adams, R. P. (2012). Practical Bayesian optimization of machine learning algorithms. Advances in Neural Information Processing Systems, 25.
Young, T., Hazarika, D., Poria, S., & Cambria, E. (2018). Recent trends in deep learning based natural language processing. IEEE Computational Intelligence Magazine, 13(3), 55-75.

Degree: Master's
Institution: National Chengchi University
Department: Department of Statistics
Student ID: 112354032
Source: http://thesis.lib.nccu.edu.tw/record/#G0112354032
Identifier: G0112354032
URI: https://nccur.lib.nccu.edu.tw/handle/140.119/157810
Type: thesis
Format: 1972616 bytes, application/pdf

Table of Contents
Chapter 1 Introduction
Chapter 2 Literature Review
  2-1 Siamese Neural Networks
  2-2 Neuron Pruning
  2-3 Summary
Chapter 3 Methodology
  3-1 Neural Networks
  3-2 Siamese Neural Networks
  3-3 Principal Component Analysis
  3-4 Experimental Procedure
    3-4-1 Sample Processing
    3-4-2 Model Training
    3-4-3 Neuron Selection
    3-4-4 Model Fine-Tuning
    3-4-5 Model Evaluation
Chapter 4 Experimental Results
  4-1 Dataset Overview
  4-2 Model Prediction Performance
Chapter 5 Conclusions and Suggestions
  5-1 Conclusions
  5-2 Future Directions and Suggestions
References
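The abstract describes selecting representative neurons by applying PCA to the weights of a pre-trained layer, keeping enough principal components to reach a cumulative explained variance threshold. The sketch below is only an illustration of that general idea, not the thesis's actual procedure: the function name and the specific selection rule (keep, for each retained component, the neuron with the largest absolute score) are assumptions.

```python
import numpy as np

def select_neurons_pca(weights, evr_threshold=0.9):
    """Illustrative sketch of PCA-based neuron selection.

    `weights` has shape (n_neurons, n_inputs): one row per neuron.
    The selection rule used here is an assumption, not necessarily
    the rule used in the thesis.
    """
    # Center the neuron weight vectors, then take the SVD; rows of vt
    # are the principal directions of the neuron population.
    centered = weights - weights.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Explained variance ratio of each component, and the number k of
    # components needed to reach the cumulative threshold.
    evr = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(evr), evr_threshold)) + 1
    # Score each neuron on the top-k components and keep, per component,
    # the neuron with the largest absolute score (deduplicated).
    scores = centered @ vt[:k].T
    return np.unique(np.argmax(np.abs(scores), axis=0))
```

A higher `evr_threshold` retains more components and therefore selects at least as many neurons, matching the abstract's observation that performance depends on the cumulative explained variance ratio retained.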
