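Academic Output - Theses

Title The Rule Extraction from Multi-layer Feed-forward Neural Networks
Author 柯文乾
Ke, Wen-Chyan
Contributors 蔡瑞煌
Tsahi Ray
柯文乾
Ke, Wen-Chyan
Keywords 知識萃取 [knowledge extraction]
規則萃取 [rule extraction]
法則萃取 [rule extraction, alternate term]
債券評價 [bond pricing]
knowledge extraction
rule extraction
bond-pricing
Date 2002
Upload time 18-Sep-2009 14:22:32 (UTC+8)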
Abstract Neural networks have been successfully applied to a variety of problems, including classification and function approximation. They are especially useful for function approximation because they have been shown to be universal approximators. In the past, function approximation problems were analyzed mainly with linear tools. However, most of these problems are nonlinear in nature, so there is considerable demand for nonlinear analysis tools. Since 1986, the neural network has been regarded as a black box: it is hard to judge whether what a network has learned is reasonable, and the network cannot effectively help users develop domain knowledge. It is therefore important to provide a reasonable and effective method for analyzing neural networks.
This thesis proposes such a method. It uses linear programming to extract and analyze the rules in a network without analyzing any data set, and then uses the sign test, a nonparametric statistical method, to generalize domain knowledge from those rules. Bond pricing is used as an example to examine the feasibility of the method. The empirical results show that the extracted rules are reasonable and that most of the bond-pricing knowledge generalized from them is also reasonable.
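The abstract says that domain knowledge is generalized from the extracted rules with the sign test. The record does not give the details, so the following is only a minimal sketch of an exact sign test. It assumes each extracted rule contributes one signed effect of an input on the output; the function name `sign_test` and the sample values are hypothetical and are not taken from the thesis.

```python
from math import comb


def sign_test(effects, alternative="two-sided"):
    """Exact sign test on signed effects; zero effects are dropped.

    Returns (number of positives, number of negatives, p-value).
    """
    pos = sum(1 for e in effects if e > 0)
    neg = sum(1 for e in effects if e < 0)
    n = pos + neg
    if n == 0:
        raise ValueError("no non-zero effects to test")

    # Under H0 each sign is equally likely, so the count of either
    # sign follows Binomial(n, 0.5).
    def upper_tail(k):
        return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

    if alternative == "greater":    # H1: effects tend to be positive
        p = upper_tail(pos)
    elif alternative == "less":     # H1: effects tend to be negative
        p = upper_tail(neg)
    else:                           # two-sided
        p = min(1.0, 2 * upper_tail(max(pos, neg)))
    return pos, neg, p


# Hypothetical signed effects of one input across ten extracted rules.
effects = [-0.8, -0.5, -1.2, -0.3, -0.9, -0.4, -0.7, 0.1, -0.6, -1.0]
pos, neg, p = sign_test(effects, alternative="less")
print(pos, neg, p)
```

A small p-value under the "less" alternative would suggest that the input's effect is negative in most rules, which is the kind of qualitative statement the abstract calls generalized knowledge.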
References Bishop, C. M. (1995). Neural networks for pattern recognition. Oxford: Clarendon Press.
Fu, L. (1994). Neural networks in computer intelligence. McGraw-Hill, Inc.
Gaweda, A. E., Setiono, R., and Zurada, J. M. (2000). "Rule extraction from feed-forward neural network for function approximation." In: Proceedings of the 5th Conference on Neural Networks and Soft Computing, Zakopane, Poland, pp. 311-316.
Gill, P. E., Mao, Z. H., and Li, Y. D. (1981). Practical optimization. New York: Academic.
Hertz, J., Krogh, A., and Palmer, R. G. (1991). Introduction to the theory of neural computation. Redwood City, CA: Addison Wesley.
Hogg, R. V., and Tanis, E. A. (1997a). Probability and statistical inference, 5th ed. New Jersey: Prentice Hall, pp. 394-455.
Hogg, R. V., and Tanis, E. A. (1997b). Probability and statistical inference, 5th ed. New Jersey: Prentice Hall, pp. 608-614.
Karnin, E. D. (1990). "A simple procedure for pruning back-propagation trained neural networks." IEEE Transactions on Neural Networks, Vol. 1, No. 2, pp. 239-242.
Kerber, R. (1992). "ChiMerge: Discretization of numeric attributes." In: Proceedings Ninth National Conference on Artificial Intelligence, Menlo Park, CA: AAAI Press, pp. 123-128.
Lloyd, S. P. (1982). "Least squares quantization in PCM." IEEE Transactions on Information Theory, Vol. 28, No. 2, pp. 129-137.
Liu, H., and Setiono, R. (1995). "Chi2: Feature selection and discretization of numeric attributes." In: Proceedings of the Seventh International Conference on Tools with Artificial Intelligence, pp. 388-391.
Liu, H., and Tan, S. T. (1995). "X2R: A fast rule generator." In: Proceedings of IEEE International Conference on Systems, Man and Cybernetics, IEEE Press.
Malkiel, B. G. (1962). "Expectations, bond prices, and the term structure of interest rates." Quarterly Journal of Economics, Vol. 76, No. 2, pp. 197-218.
Murty, K. G. (1983). Linear Programming. New York: John Wiley & Sons, pp. 91-181.
Neter, J., Kutner, M. H., Nachtsheim, C. J., and Wasserman, W. (1996). Applied linear regression models, 3rd ed. Richard D. Irwin, p. 640.
Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Mateo, CA: Morgan Kaufmann.
Rosenblatt, F. (1958). "The perceptron: a probabilistic model for information storage and organization in the brain." Psychological Review, Vol. 65, pp. 386-408.
Rumelhart, D. E., Hinton, G. E., and Williams, R. (1986). "Learning internal representation by error propagation." Parallel Distributed Processing. Cambridge, MA: MIT Press, Vol. 1, pp. 318-362.
Saito, K., and Nakano, R. (2002). "Extracting regression rules from neural networks." Neural Networks, Vol. 15, No. 10, pp. 1279-1288.
Saito, K., and Nakano, R. (2000). "Discovery of relevant weight by minimizing cross-validation error." In: Proceedings of the Fourth Pacific-Asia Conference on Knowledge Discovery and Data Mining, Kyoto, Japan, pp. 372-375.
Seber, G. A. F., and Wild, C. J. (1989). Nonlinear regression. New York: John Wiley & Sons, pp. 465-471.
Setiono, R., Leow, W. K., and Zurada, J. M. (2002). "Extraction of rules from artificial neural networks for nonlinear regression." IEEE Transactions on Neural Networks, Vol. 13, No. 3, pp. 564-577.
Setiono, R. (1997). "A penalty function approach for pruning feed-forward neural networks." Neural Computation, Vol. 9, No. 1, pp. 185-204.
Setiono, R., and Liu, H. (1997). "NeuroLinear: From neural networks to oblique decision rules." Neurocomputing, Vol. 17, No. 1, pp. 1-24.
Setiono, R., and Liu, H. (1996). "Symbolic representation of neural networks." IEEE Computer, Vol. 29, No. 3, pp. 71-77.
Sharpe, W. F., and Alexander, G. J. (1990). Investments, 4th ed. New Jersey: Prentice-Hall, Inc., pp. 382-384.
Smith, M. (1993). Neural networks for statistical modeling. New York: Van Nostrand Reinhold, p. 167.
Stone, M. (1974). "Cross-validatory choice and assessment of statistical predictions (with discussion)." Journal of the Royal Statistical Society B, Vol. 36, No. 1, pp. 111-147.
Taha, I. A., and Ghosh, J. (1999). "Symbolic interpretation of artificial neural networks." IEEE Transactions on Knowledge and Data Engineering, Vol. 11, No. 3, pp. 448-463.
Taha, I. A., and Ghosh, J. (1996). "Three techniques for extracting rule from feed-forward networks." In: Dagli, C. H., Akay, M., Fernandez, B., Chen, C. L. P., and Ghosh, J. (Eds.). Intelligent Engineering System Through Artificial Neural Networks (Volume 6), St. Louis: ASME Press, pp. 23-28.
The MathWorks, Inc. (2002). Optimization Toolbox User's Guide. [Online]. Available: http://www.mathworks.com/access/helpdesk/help/pdf_doc/optim/optim_tb.pdf
Towell, G., and Shavlik, J. (1993). "The extraction of refined rules from knowledge-based neural networks." Machine Learning, Vol. 13, No. 1, pp. 71-101.
Vapnik, V. (1995). The nature of statistical learning theory. New York: Springer-Verlag.
Van Ooyen, A., and Nienhuis, B. (1992). "Improving the convergence of the backpropagation algorithm." Neural Networks, Vol. 5, No. 3, pp. 465-471.
Weijters, T., and Bosch, A. V. D. (1998). "Interpretable neural networks with BP-SOM." In: Tasks and Methods in Applied Artificial Intelligence. Lecture Notes in Artificial Intelligence 1416 (A. del Pobil, J. Mira, and M. Ali, eds.), Berlin: Springer, pp. 564-573.
Zhou, R. R., Chen, S. F., and Chen, Z. Q. (2000). "A statistics based approach for extracting priority rules from trained neural networks." In: Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Italy, Vol. 3, pp. 401-406.
Description Master's
National Chengchi University
Graduate Institute of Information Management
90356002
91
Source http://thesis.lib.nccu.edu.tw/record/#G0090356002
Type thesis
dc.contributor.advisor 蔡瑞煌 (zh_TW)
dc.contributor.advisor Tsahi Ray (en_US)
dc.contributor.author (Author) 柯文乾 (zh_TW)
dc.contributor.author (Author) Ke, Wen-Chyan (en_US)
dc.creator (Author) 柯文乾 (zh_TW)
dc.creator (Author) Ke, Wen-Chyan (en_US)
dc.date (Date) 2002 (en_US)
dc.date.accessioned 18-Sep-2009 14:22:32 (UTC+8)
dc.date.available 18-Sep-2009 14:22:32 (UTC+8)
dc.date.issued (Upload time) 18-Sep-2009 14:22:32 (UTC+8)
dc.identifier (Other identifier) G0090356002 (en_US)
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/35180
dc.description (Description) Master's (zh_TW)
dc.description (Description) National Chengchi University (zh_TW)
dc.description (Description) Graduate Institute of Information Management (zh_TW)
dc.description (Description) 90356002 (zh_TW)
dc.description (Description) 91 (zh_TW)
dc.description.tableofcontents Contents
1. Introduction 1
2. Related Works 4
2.1 Cluster Analysis 4
2.1.1 Chi2 Algorithm (Liu and Setiono, 1995) 4
2.2 Rule Extraction Techniques 6
2.2.1 NeuroLinear (Setiono and Liu, 1997) 6
Neural Network Training and Pruning 6
Rule Generation 8
2.2.2 STARE (Zhou et al., 2000) 10
Data Generation 10
Continuous Attribute Processing 10
Rule Creation 11
Priority Formation 12
Fidelity Evaluation 13
2.2.3 CREFANN (Gaweda et al., 2000) 15
Rule Extraction Algorithm 15
Rule-based Approximation Algorithm 16
2.2.4 REFANN (Setiono et al., 2002) 17
Neural Network Training and Pruning Algorithm 17
To Approximate Hidden Node Activation Function 20
Rule Generation 22
2.2.5 RN2 (Saito and Nakano, 2002) 24
Neural Network Training 24
Method for Rule Extraction 26
2.3 Discussion 29
3. Methodology 32
3.1 Definition 32
3.2 Method of Extracting Rules from Neural Networks 33
3.2.1 The Approximation of Hidden Node Activation Function 33
3.2.2 The Differential Analysis of Rules 36
3.2.3 The Rule Extraction Process 38
4. Empirical Study 40
4.1 Bond Pricing 40
4.2 Data Collection and Method Application 42
4.3 Results and Analysis 49
5. Conclusions and Future Work 52
5.1 Conclusions 52
5.2 Future Work 53
References 54
dc.format.extent 128892 bytes
dc.format.extent 88876 bytes
dc.format.extent 154079 bytes
dc.format.extent 131512 bytes
dc.format.extent 213376 bytes
dc.format.extent 384595 bytes
dc.format.extent 255186 bytes
dc.format.extent 304246 bytes
dc.format.extent 159078 bytes
dc.format.extent 158365 bytes
dc.format.mimetype application/pdf
dc.format.mimetype application/pdf
dc.format.mimetype application/pdf
dc.format.mimetype application/pdf
dc.format.mimetype application/pdf
dc.format.mimetype application/pdf
dc.format.mimetype application/pdf
dc.format.mimetype application/pdf
dc.format.mimetype application/pdf
dc.format.mimetype application/pdf
dc.language.iso en_US
dc.source.uri (Source) http://thesis.lib.nccu.edu.tw/record/#G0090356002 (en_US)
dc.subject (Keywords) 知識萃取 [knowledge extraction] (zh_TW)
dc.subject (Keywords) 規則萃取 [rule extraction] (zh_TW)
dc.subject (Keywords) 法則萃取 [rule extraction, alternate term] (zh_TW)
dc.subject (Keywords) 債券評價 [bond pricing] (zh_TW)
dc.subject (Keywords) knowledge extraction (en_US)
dc.subject (Keywords) rule extraction (en_US)
dc.subject (Keywords) bond-pricing (en_US)
dc.title (Title) The Rule Extraction from Multi-layer Feed-forward Neural Networks (zh_TW)
dc.type (Type) thesis (en)