Title: 自適型單層前饋式類神經網路的裁剪機制與主成分分析
(The Pruning Mechanism of Adaptive Single-hidden Layer Neural Networks and Principal Component Analysis)
Author: Sun, Shao-Chieh (孫紹傑)
Advisor: Tsaih, Rua-Huan (蔡瑞煌)
Keywords: Principal Component Analysis; Cramming, Softening, and Integrating learning algorithm; Artificial Neural Network; Hidden Node Pruning
Date: 2020
Uploaded: 2-Sep-2020 11:46:46 (UTC+8)

Abstract:
In artificial neural network (ANN) learning, there is as yet no systematic mechanism to help effectively identify irrelevant hidden nodes, whose removal mitigates overfitting. To address this challenge, we focus on establishing a systematic Pruning Detection (PD) mechanism based on Principal Component Analysis (PCA) to reliably and efficiently determine potential irrelevant hidden nodes. The proposed ASLFNPD mechanism has the following characteristics: (1) it applies to adaptive single-hidden layer feed-forward neural networks (ASLFN) with the ReLU activation function on all hidden nodes; (2) it uses the PCA mechanism to help identify potential irrelevant hidden nodes. We conducted experiments and recorded the omega values generated by PCA, together with related information, to verify the effectiveness and efficiency of the proposed mechanism.

Description: Master's thesis
National Chengchi University
Department of Management Information Systems
Student ID: 107356022
Source: http://thesis.lib.nccu.edu.tw/record/#G0107356022
Type: thesis
URI: http://nccur.lib.nccu.edu.tw/handle/140.119/131496
DOI: 10.6814/NCCU202001062

Table of Contents:
1. Introduction
2. Literature Review
2.1 Rectified Linear Unit (ReLU)
2.2 Single-hidden Layer Feed-forward Neural Networks with One Output Node
2.3 Adaptive Single-hidden Layer Feed-forward Neural Networks (ASLFN)
2.4 The Cramming, Softening and Integrating Learning Algorithm
2.5 Overfitting
2.6 Principal Component Analysis
3. Methodology
4. Experiment Design
5. Experiment Result
6. Summary and Future Work
Appendix
References
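The abstract describes applying PCA to detect potential irrelevant hidden nodes in a single-hidden-layer ReLU network. The sketch below illustrates that general idea only; it is not the thesis's actual ASLFNPD/PD algorithm, and the function name, tolerance, and candidate-selection rule are all illustrative assumptions. It runs PCA (via SVD) on the centered hidden-activation matrix and treats hidden nodes loading on near-zero-variance principal components as pruning candidates.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def potential_irrelevant_hidden_nodes(X, W, b, var_ratio_tol=1e-6):
    """Flag hidden nodes that are pruning candidates, via PCA on the
    hidden-activation matrix (illustrative sketch, not the thesis's PD).

    X: (n_samples, n_inputs) inputs
    W: (n_inputs, n_hidden) input-to-hidden weights
    b: (n_hidden,) hidden biases
    Returns a sorted list of candidate hidden-node indices.
    """
    H = relu(X @ W + b)                      # hidden activations, (n, m)
    Hc = H - H.mean(axis=0)                  # center columns before PCA
    # PCA via SVD: squared singular values are the component variances
    _, s, Vt = np.linalg.svd(Hc, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    # components carrying essentially no variance (s is sorted descending)
    n_dead = int(np.sum(ratio < var_ratio_tol))
    if n_dead == 0:
        return []
    # loadings of the near-null components; the hidden node with the
    # largest absolute weight in a null direction is redundant with the
    # other nodes and becomes a pruning candidate
    null_dirs = Vt[-n_dead:]                 # (n_dead, m)
    candidates = np.unique(np.argmax(np.abs(null_dirs), axis=1))
    return sorted(candidates.tolist())
```

For example, if two hidden nodes share identical weights and bias, their activation columns coincide, the centered activation matrix gains a zero-variance principal component, and one of the two nodes is reported as a candidate.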