Title 中文資訊擷取結果之錯誤偵測
Error Detection on Chinese Information Extraction Results
Author 鄭雍瑋
Cheng, Yung-Wei
Contributors 劉吉軒
Liu, Jyi-Shane
鄭雍瑋
Cheng, Yung-Wei
Keywords Error Detection
Information Extraction
Textual Data Profiling
Date 2005
Upload time 17-Sep-2009 13:56:10 (UTC+8)
Abstract Given a target subject and a text collection, information extraction (IE) techniques can populate a database in which each record is an instance of the subject documented in the collection, turning natural-language documents into structured core information. However, even state-of-the-art IE techniques produce results that contain errors. Detecting and correcting these errors manually is labor-intensive and time-consuming, and this validation cost remains a major obstacle to deploying practical IE applications with high validity requirements.
In this thesis, we propose two error detection methods: a string graph structure method and a string feature-based method. The former uses a graph structure to compare the characters in each data item and the relations between those characters, then computes a matching score for each item with a formula; the score indicates whether the item is erroneous. The latter uses string features to describe the surface characteristics of each string and applies the C4.5 and SVM machine learning classifiers (C4.5 inducing a decision tree) to classify each item as valid or invalid. The two methods differ in that the former compares the raw string content directly, drawing on the graph-theoretic concepts of node position and neighboring nodes, whereas the latter first converts the raw string into feature values and then classifies them.
In our experiments, we use a database of IE results from the Presidential Office personnel appointment and dismissal gazette (總統府人事任免公報) as test data. The results show that the proposed methods effectively detect invalid values. This reduces validation cost, improves the quality of IE results, and provides both analytical and empirical evidence that the usability of IE results can be enhanced, which in turn supports wider practical application of IE techniques.
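The two detection ideas described in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual scoring formula or feature set, which the abstract does not specify. It assumes that nodes are (position, character) pairs, that edges join adjacent characters, that the score is the fraction of a string's nodes and edges already seen in trusted data, and that the surface features are simple character-class counts. All function names and the sample job-title strings are hypothetical.

```python
from collections import Counter


def build_graph(strings):
    """Count (position, character) nodes and adjacent-character edges
    over a list of trusted strings (assumed graph definition)."""
    nodes, edges = Counter(), Counter()
    for s in strings:
        for i, ch in enumerate(s):
            nodes[(i, ch)] += 1
        for a, b in zip(s, s[1:]):
            edges[(a, b)] += 1
    return nodes, edges


def graph_score(s, nodes, edges):
    """Fraction of the string's nodes and edges found in the trusted graph.
    An illustrative score only; the empty string scores 0.0."""
    hits = [(i, ch) in nodes for i, ch in enumerate(s)]
    hits += [(a, b) in edges for a, b in zip(s, s[1:])]
    return sum(hits) / len(hits) if hits else 0.0


def string_features(s):
    """Surface features: length, digit count, CJK ideograph count, other count.
    A classifier such as C4.5 or an SVM would be trained on vectors like this."""
    digits = sum(ch.isdigit() for ch in s)
    cjk = sum("\u4e00" <= ch <= "\u9fff" for ch in s)
    return [len(s), digits, cjk, len(s) - digits - cjk]


if __name__ == "__main__":
    # Hypothetical trusted values for a "job title" field.
    nodes, edges = build_graph(["科長", "科員", "處長"])
    threshold = 0.8  # assumed cut-off; would need tuning on real data
    for value in ["科長", "處員", "長9x"]:
        score = graph_score(value, nodes, edges)
        flag = "suspect" if score < threshold else "ok"
        print(value, round(score, 2), flag, string_features(value))
```

In this sketch, a value is flagged when too few of its characters and character transitions match what was seen in trusted records. The feature vector takes the other route: it discards the raw characters and keeps only numeric descriptors for a separate classifier.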
Description Master's thesis
National Chengchi University
Department of Computer Science
93753006
94
Source http://thesis.lib.nccu.edu.tw/record/#G0093753006
Type thesis
dc.contributor.advisor 劉吉軒 (zh_TW)
dc.contributor.advisor Liu, Jyi-Shane (en_US)
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/32649
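dc.description.tableofcontents Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Method
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Information Extraction
2.1.1 Definition of Information Extraction
2.1.2 Information Extraction Approaches
2.1.3 Information Extraction Techniques
2.2 Data Quality
2.2.1 Definition of Data Quality
2.2.2 Dimensions of Data Quality
2.3 Data Cleaning
2.3.1 Definition of Data Cleaning
2.3.2 Data Cleaning Techniques
2.4 Graph Structure
2.5 Classification Analysis
2.5.1 ID3 Decision Tree Induction
2.5.2 C4.5 Decision Tree Induction
2.5.3 Support Vector Machine
2.5.4 Multi-Expert Classifier
2.6 Summary
Chapter 3 Erroneous Data Detection Methods
3.1 Requirements Analysis
3.1.1 Error Analysis
3.1.2 Anomalies in Information Extraction Results
3.1.3 Data Detection Model
3.2 String Graph Structure Detection Method
3.2.1 String Graph Structure Builder
3.2.2 Rule Model for Detecting Erroneous Data
3.2.3 Data Inference Engine
3.3 String Feature Detection Method
3.3.1 String Feature Extractor
3.3.2 String Feature Converter
3.3.3 C4.5 Algorithm
3.3.4 SVM Algorithm
3.3.5 Data Inference Engine
Chapter 4 Experimental Analysis, Discussion, and Application of the Methods
4.1 Experimental Test Data
4.2 Experimental Evaluation Method
4.3 Analysis of Related Data Cleaning Techniques
4.4 Experimental Design and Discussion of Results
4.4.1 Experimental Framework for the String Graph Method
4.4.1.1 Results for Graph Structure Node Matching Schemes
4.4.1.2 Results for Number of Training Records
4.4.1.3 Results for Number of Years of Target Data
4.4.1.4 Results for String Graph Method Parameters
4.4.1.5 Summary of String Graph Structure Experiments
4.4.2 Experimental Framework for the String Feature Method
4.4.2.1 Results for the String Feature Converter
4.4.2.2 Results for Number of Years of Training and Target Data
4.4.2.3 Results for String Feature Method Parameters
4.4.2.4 Results for Number of Training Records
4.4.2.5 Summary of String Feature Experiments
4.4.3 Discussion of Overall Experimental Results
4.5 Application of the Error Detection Methods
4.6 Summary
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References
Appendix A
Appendix B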
dc.language.iso en_US