Title: The Pruning Mechanism of Adaptive Single-hidden Layer Neural Networks and Principal Component Analysis (自適型單層前饋式類神經網路的裁剪機制與主成分分析)
Author: Sun, Shao-Chieh (孫紹傑)
Advisor: Tsaih, Rua-Huan (蔡瑞煌)
Keywords: Principal Component Analysis; Cramming, Softening, and Integrating learning algorithm; Artificial Neural Network; Hidden Node Pruning
Date: 2020
Upload time: 2-Sep-2020 11:46:46 (UTC+8)
Abstract: In the field of machine learning, no systematic mechanism has yet been established within the artificial neural network (ANN) framework to reliably identify irrelevant hidden nodes that can be discarded to mitigate overfitting. To address this challenge, we propose the Pruning Detection (PD) mechanism, which systematically incorporates Principal Component Analysis (PCA) to reliably and efficiently determine potential irrelevant hidden nodes. The proposed ASLFNPD mechanism has the following characteristics: (1) it applies to adaptive single-hidden-layer feed-forward neural networks (ASLFN) with the ReLU activation function on all hidden nodes; (2) it uses PCA to help identify potential irrelevant hidden nodes. We conducted experiments and recorded the omega values produced during the PCA procedure, together with related information, to verify the effectiveness and efficiency of the proposed mechanism.
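The abstract describes pruning hidden nodes of a single-hidden-layer ReLU network with the help of PCA. The thesis's actual PD mechanism and its omega statistic are defined in the full text; the sketch below is only a hypothetical illustration of the general idea, applying PCA to a hidden-activation matrix and flagging nodes whose loadings on the retained principal components are negligible. The function names, the variance threshold, and the flagging rule are all assumptions for illustration, not the thesis's procedure.

```python
import numpy as np

def hidden_activations(X, W, b):
    """ReLU activations of a single hidden layer: max(0, XW + b)."""
    return np.maximum(0.0, X @ W + b)

def potential_irrelevant_nodes(A, var_threshold=0.95, rel_cutoff=0.1):
    """Flag hidden nodes whose loadings on the retained principal
    components are negligible (hypothetical rule, not the thesis's PD)."""
    A_centered = A - A.mean(axis=0)
    # SVD of the centered activation matrix; the rows of Vt are the
    # principal-component loadings over the hidden nodes.
    _, S, Vt = np.linalg.svd(A_centered, full_matrices=False)
    var_ratio = (S ** 2) / np.sum(S ** 2)
    # Retain the fewest components explaining var_threshold of the variance.
    k = int(np.searchsorted(np.cumsum(var_ratio), var_threshold)) + 1
    # A node with near-zero loadings on every retained component contributes
    # little variance, so it becomes a pruning candidate.
    contribution = np.sum(Vt[:k] ** 2, axis=0)
    return np.where(contribution < contribution.mean() * rel_cutoff)[0]

# Toy check: make hidden node 3 "dead" (constant zero activation),
# so the PCA-based rule should flag it as a pruning candidate.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
W = rng.normal(size=(4, 4))
A = hidden_activations(X, W, np.zeros(4))
A[:, 3] = 0.0
candidates = potential_irrelevant_nodes(A)
```

A dead node's centered activation column is all zeros, so the retained singular vectors carry (numerically) zero loading on it and it falls below the cutoff; a node whose activations merely duplicate another node's would similarly add no new variance directions.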
References:
[1] Agatonovic-Kustrin, S., & Beresford, R. (2000). Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. Journal of Pharmaceutical and Biomedical Analysis, 22(5), 717-727.
[2] Tsaih, R. R. (1998). An explanation of reasoning neural networks. Mathematical and Computer Modelling, 28(2), 37-44.
[3] LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
[4] Xue, Y. (2019, February). An Overview of Overfitting and its Solutions. In Journal of Physics: Conference Series (Vol. 1168, No. 2, p. 022022). IOP Publishing.
[5] Tsaih, R. H., & Cheng, T. C. (2009). A resistant learning procedure for coping with outliers. Annals of Mathematics and Artificial Intelligence, 57(2), 161-180.
[6] Chang, H.Y. (2019). The sequentially-learning-based algorithm and the prediction of the turning points of bull and bear markets (Master’s dissertation). National Chengchi University, 1-39.
[7] Pearson, K. (1901). LIII. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 2(11), 559-572.
[8] Smith, L. I. (2002). A tutorial on principal components analysis.
[9] Shlens, J. (2014). A tutorial on principal component analysis. arXiv preprint arXiv:1404.1100.
[10] Hanna, A. J. (2018). A top-down approach to identifying bull and bear market states. International Review of Financial Analysis, 55, 93-110.
[11] Pagan, A. R., & Sossounov, K. A. (2003). A simple framework for analysing bull and bear markets. Journal of Applied Econometrics, 18(1), 23-46.
[12] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929-1958.
[13] Talathi, S. S., & Vartak, A. (2015). Improving performance of recurrent neural network with relu nonlinearity. arXiv preprint arXiv:1511.03771.
[14] Tsaih, R. R. (1993). The softening learning procedure. Mathematical and Computer Modelling, 18(8), 61-64.
[15] Allamy, H. (2014). Methods to Avoid Over-Fitting and Under-Fitting in Supervised Machine Learning (Comparative Study). Computer Science, Communication and Instrumentation Devices, Kochi, India (December 27, 2014).
[16] Caruana, R., Lawrence, S., & Giles, C. L. (2001). Overfitting in neural nets: Backpropagation, conjugate gradient, and early stopping. In Advances in neural information processing systems (pp. 402-408).
[17] Cawley, G. C. (2012, October). Over-Fitting in Model Selection and Its Avoidance. In IDA (p. 1).
[18] Chauvin, Y. (1989). A back-propagation algorithm with optimal use of hidden units. In Advances in neural information processing systems (pp. 519-526).
[19] Ishikawa, M. (1989). A structural learning algorithm with forgetting of link weights. In International 1989 Joint Conference on Neural Networks (p. 626). IEEE.
[20] Weigend, A. S., Rumelhart, D. E., & Huberman, B. A. (1991). Generalization by weight-elimination with application to forecasting. In Advances in neural information processing systems (pp. 875-882).
[21] Krogh, A., & Hertz, J. A. (1992). A simple weight decay can improve generalization. In Advances in neural information processing systems (pp. 950-957).
[22] LeCun, Y., Denker, J. S., & Solla, S. A. (1990). Optimal brain damage. In Advances in neural information processing systems (pp. 598-605).
[23] Srivastava, N. (2013). Improving neural networks with dropout. University of Toronto, 182(566), 7.
[24] Jackson, J. E. (2005). A user's guide to principal components (Vol. 587). John Wiley & Sons. (pp. 1-3)
[25] Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24(6), 417.
[26] Fisher, R. A., & Mackenzie, W. A. (1923). Studies in crop variation. II. The manurial response of different potato varieties. The Journal of Agricultural Science, 13(3), 311-320.
[27] Wold, S., Esbensen, K., & Geladi, P. (1987). Principal component analysis. Chemometrics and intelligent laboratory systems, 2(1-3), 37-52.
[28] Tripathi, A. (2019). A Complete Guide to Principal Component Analysis – PCA in Machine Learning. URL: https://towardsdatascience.com/a-complete-guide-to-principal-component-analysis-pca-in-machine-learning-664f34fc3e5a
[29] Jolliffe, I. T. (2002). Principal component analysis.
[30] Xu, X., & Wen, C. (2017). Fault Diagnosis Method Based on Information Entropy and Relative Principal Component Analysis. Journal of Control Science and Engineering, 2017.
[31] Kashani, M. N., Aminian, J., Shahhosseini, S., & Farrokhi, M. (2012). Dynamic crude oil fouling prediction in industrial preheaters using optimized ANN based moving window technique. Chemical Engineering Research and Design, 90(7), 938-949.
[32] Chen, S. S. (2009). Predicting the bear stock market: Macroeconomic variables as leading indicators. Journal of Banking & Finance, 33(2), 211-223.
[33] Chen, S. S. (2012). Revisiting the empirical linkages between stock returns and trading volume. Journal of Banking & Finance, 36(6), 1781-1788.
Description: Master's thesis, National Chengchi University, Department of Management Information Systems (107356022)
Source: http://thesis.lib.nccu.edu.tw/record/#G0107356022
Type: thesis
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/131496
Table of contents:
1. INTRODUCTION
2. LITERATURE REVIEW
2.1 RECTIFIED LINEAR UNIT (RELU)
2.2 SINGLE-HIDDEN LAYER FEED-FORWARD NEURAL NETWORKS WITH ONE OUTPUT NODE
2.3 ADAPTIVE SINGLE-HIDDEN LAYER FEED-FORWARD NEURAL NETWORKS (ASLFN)
2.4 THE CRAMMING, SOFTENING AND INTEGRATING LEARNING ALGORITHM
2.5 OVERFITTING
2.6 PRINCIPAL COMPONENT ANALYSIS
3. METHODOLOGY
4. EXPERIMENT DESIGN
5. EXPERIMENT RESULTS
6. SUMMARY AND FUTURE WORK
APPENDIX
REFERENCES
DOI: 10.6814/NCCU202001062