The Rule Extraction from Multi-layer Feed-forward Neural Networks

Title	The Rule Extraction from Multi-layer Feed-forward Neural Networks
Author	柯文乾 Ke, Wen-Chyan
Contributors	蔡瑞煌 Tsahi Ray; 柯文乾 Ke, Wen-Chyan
Keywords	knowledge extraction; rule extraction; bond-pricing
Date	2002
Upload time	18-Sep-2009 14:22:32 (UTC+8)
Abstract	Neural networks have been successfully applied to a variety of problems, including classification and function approximation. They are especially useful for function approximation because they have been shown to be universal approximators. In the past, function approximation problems were mainly analyzed with linear tools. However, most such problems are nonlinear in nature, so there is considerable demand for nonlinear analysis tools. Since 1986, the neural network has been regarded as a black box: it is hard to determine whether what a network has learned is reasonable, and the network cannot effectively help users develop domain knowledge. It is therefore important to provide a reasonable and effective method for analyzing neural networks. This thesis proposes such a method. It uses linear programming to extract and analyze the rules embedded in a network, without relying on the analysis of any data set, and then uses the sign test, a nonparametric statistical method, to generalize domain knowledge from those rules. Bond pricing is used as a case study to examine the feasibility of the method. The results show that the extracted rules are reasonable and that most of the bond-pricing knowledge generalized from them is also reasonable.
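The abstract names the sign test as the step that turns extracted rules into domain knowledge but does not spell out its mechanics. The following is a minimal sketch, assuming the test is applied to the signs of one input's coefficients across the extracted rules (the thesis may set it up differently). The function name `sign_test_p_value` and the coefficient values are hypothetical illustrations, not data from the thesis.

```python
from math import comb


def sign_test_p_value(values, tol=0.0):
    """Two-sided exact sign test of H0: positive and negative signs are equally likely."""
    pos = sum(1 for v in values if v > tol)
    neg = sum(1 for v in values if v < -tol)
    n = pos + neg  # ties (values within tol of zero) are discarded
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical coefficients of one input (say, yield) in ten extracted rules
coefficients = [-0.8, -1.2, -0.5, -0.9, 0.1, -0.7, -1.1, -0.3, -0.6, -0.4]
p = sign_test_p_value(coefficients)
print(f"two-sided p-value: {p:.4f}")
```

Under this assumption, a small p-value would suggest that the input affects the output in a consistent direction across rules, which is the kind of statement that could then be compared with domain expectations about bond prices.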
References
Bishop, C. M. (1995). Neural networks for pattern recognition. Oxford: Clarendon Press.
Fu, L. (1994). Neural networks in computer intelligence. McGraw-Hill, Inc.
Gaweda, A. E., Setiono, R., and Zurada, J. M. (2000). "Rule extraction from feed-forward neural network for function approximation." In: Proceedings of the 5th Conference on Neural Networks and Soft Computing, Zakopane, Poland, pp. 311-316.
Gill, P. E., Mao, Z. H., and Li, Y. D. (1981). Practical optimization. New York: Academic.
Hertz, J., Krogh, A., and Palmer, R. G. (1991). Introduction to the theory of neural computation. Redwood City, CA: Addison Wesley.
Hogg, R. V., and Tanis, E. A. (1997a). Probability and statistical inference, 5th ed. New Jersey: Prentice Hall, pp. 394-455.
Hogg, R. V., and Tanis, E. A. (1997b). Probability and statistical inference, 5th ed. New Jersey: Prentice Hall, pp. 608-614.
Karnin, E. D. (1990). "A simple procedure for pruning back-propagation trained neural networks." IEEE Transactions on Neural Networks, Vol. 1, No. 2, pp. 239-242.
Kerber, R. (1992). "ChiMerge: Discretization of numeric attributes." In: Proceedings Ninth National Conference on Artificial Intelligence, Menlo Park, CA: AAAI Press, pp. 123-128.
Lloyd, S. P. (1982). "Least squares quantization in PCM." IEEE Transactions on Information Theory, Vol. 28, No. 2, pp. 129-137.
Liu, H., and Setiono, R. (1995). "Chi2: Feature selection and discretization of numeric attributes." In: Proceedings of the Seventh International Conference on Tools with Artificial Intelligence, pp. 388-391.
Liu, H., and Tan, S. T. (1995). "X2R: A fast rule generator." In: Proceedings of IEEE International Conference on Systems, Man and Cybernetics, IEEE Press.
Malkiel, B. G. (1962). "Expectations, bond prices, and the term structure of interest rates." Quarterly Journal of Economics, Vol. 76, No. 2, pp. 197-218.
Murty, K. G. (1983). Linear Programming. New York: John Wiley & Sons, pp. 91-181.
Neter, J., Kutner, M. H., Nachtsheim, C. J., and Wasserman, W. (1996). Applied linear regression models, 3rd ed. Richard D. Irwin, pp. 640.
Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Mateo, CA: Morgan Kaufmann.
Rosenblatt, F. (1958). "The perceptron: a probabilistic model for information storage and organization in the brain." Psychological Review, Vol. 65, pp. 386-408.
Rumelhart, D. E., Hinton, G. E., and Williams, R. (1986). "Learning internal representation by error propagation." Parallel Distributed Processing. Cambridge, MA: MIT Press, Vol. 1, pp. 318-362.
Saito, K., and Nakano, R. (2002). "Extracting regression rules from neural networks." Neural Networks, Vol. 15, No. 10, pp. 1279-1288.
Saito, K., and Nakano, R. (2000). "Discovery of relevant weight by minimizing cross-validation error." In: Proceedings of the Fourth Pacific-Asia Conference on Knowledge Discovery and Data Mining, Kyoto, Japan, pp. 372-375.
Seber, G. A. F., and Wild, C. J. (1989). Nonlinear regression. New York: John Wiley & Sons, pp. 465-471.
Setiono, R., Leow, W. K., and Zurada, J. M. (2002). "Extraction of rules from artificial neural networks for nonlinear regression." IEEE Transactions on Neural Networks, Vol. 13, No. 3, pp. 564-577.
Setiono, R. (1997). "A penalty function approach for pruning feed-forward neural networks." Neural Computation, Vol. 9, No. 1, pp. 185-204.
Setiono, R., and Liu, H. (1997). "NeuroLinear: From neural networks to oblique decision rules." Neurocomputing, Vol. 17, No. 1, pp. 1-24.
Setiono, R., and Liu, H. (1996). "Symbolic representation of neural networks." IEEE Computer, Vol. 29, No. 3, pp. 71-77.
Sharpe, W. F., and Alexander, G. J. (1990). Investments, 4th ed. New Jersey: Prentice-Hall, Inc., pp. 382-384.
Smith, M. (1993). Neural networks for statistical modeling. New York: Van Nostrand Reinhold, pp. 167.
Stone, M. (1974). "Cross-validatory choice and assessment of statistical predictions (with discussion)." Journal of the Royal Statistical Society B, Vol. 36, No. 1, pp. 111-147.
Taha, I. A., and Ghosh, J. (1999). "Symbolic interpretation of artificial neural networks." IEEE Transactions on Knowledge and Data Engineering, Vol. 11, No. 3, pp. 448-463.
Taha, I. A., and Ghosh, J. (1996). "Three techniques for extracting rule from feed-forward networks." In: Dagli, C. H., Akay, M., Fernandez, B., Chen, C. L. P., and Ghosh, J. (Eds.). Intelligent Engineering System Through Artificial Neural Networks (Volume 6), St. Louis: ASME Press, pp. 23-28.
The MathWorks, Inc. (2002). Optimization Toolbox User's Guide. [Online]. Available: http://www.mathworks.com/access/helpdesk/help/pdf_doc/optim/optim_tb.pdf
Towell, G., and Shavlik, J. (1993). "The extraction of refined rules from knowledge-based neural networks." Machine Learning, Vol. 13, No. 1, pp. 71-101.
Vapnik, V. (1995). The nature of statistical learning theory. New York: Springer-Verlag.
Van Ooyen, A., and Nienhuis, B. (1992). "Improving the convergence of the backpropagation algorithm." Neural Networks, Vol. 5, No. 3, pp. 465-471.
Weijters, T., and Bosch, A. V. D. (1998). "Interpretable neural networks with BP-SOM." In: Tasks and Methods in Applied Artificial Intelligence. Lecture Notes in Artificial Intelligence 1416 (A. del Pobil, J. Mira, and M. Ali, eds.), Berlin: Springer, pp. 564-573.
Zhou, R. R., Chen, S. F., and Chen, Z. Q. (2000). "A statistics based approach for extracting priority rules from trained neural networks." In: Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Network, Como, Italy, Vol. 3, pp. 401-406.
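Description	Master's thesis; National Chengchi University (國立政治大學); Graduate Institute of Information Management (資訊管理研究所); 90356002; 91
Source	http://thesis.lib.nccu.edu.tw/record/#G0090356002
Type	thesis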

dc.contributor.advisor	蔡瑞煌	zh_TW
dc.contributor.advisor	Tsahi Ray	en_US
dc.identifier (Other Identifiers)	G0090356002	en_US
dc.identifier.uri (URI)	https://nccur.lib.nccu.edu.tw/handle/140.119/35180	-
dc.description.tableofcontents	Contents	zh_TW
1. Introduction
2. Related Works
  2.1 Cluster Analysis
    2.1.1 Chi2 Algorithm (Liu and Setiono, 1995)
  2.2 Rule Extraction Techniques
    2.2.1 NeuroLinear (Setiono and Liu, 1997)
      Neural Network Training and Pruning
      Rule Generation
    2.2.2 STARE (Zhou et al., 2000)
      Data Generation
      Continuous Attribute Processing
      Rule Creation
      Priority Formation
      Fidelity Evaluation
    2.2.3 CREFANN (Gaweda et al., 2000)
      Rule Extraction Algorithm
      Rule-based Approximation Algorithm
    2.2.4 REFANN (Setiono et al., 2002)
      Neural Network Training and Pruning Algorithm
      To Approximate Hidden Node Activation Function
      Rule Generation
    2.2.5 RN2 (Saito and Nakano, 2002)
      Neural Network Training
      Method for Rule Extraction
  2.3 Discussion
3. Methodology
  3.1 Definition
  3.2 Method of Extracting Rules from Neural Networks
    3.2.1 The Approximation of Hidden Node Activation Function
    3.2.2 The Differential Analysis of Rules
    3.2.3 The Rule Extraction Process
4. Empirical Study
  4.1 Bond Pricing
  4.2 Data Collection and Method Application
  4.3 Results and Analysis
5. Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
References
dc.format.extent	128892, 88876, 154079, 131512, 213376, 384595, 255186, 304246, 159078, 158365 bytes (10 files)	-
dc.format.mimetype	application/pdf (all 10 files)	-
dc.language.iso	en_US	-
