dc.contributor.advisor | 沈錳坤 | zh_TW |
dc.contributor.advisor | Shan, Man-Kwan | en_US |
dc.contributor.author (Authors) | 周大鈞 | zh_TW |
dc.contributor.author (Authors) | Chou, Ta-Chun | en_US |
dc.creator (作者) | 周大鈞 | zh_TW |
dc.creator (作者) | Chou, Ta-Chun | en_US |
dc.date (日期) | 2004 | en_US |
dc.date.accessioned | 17-Sep-2009 14:06:05 (UTC+8) | - |
dc.date.available | 17-Sep-2009 14:06:05 (UTC+8) | - |
dc.date.issued (上傳時間) | 17-Sep-2009 14:06:05 (UTC+8) | - |
dc.identifier (Other Identifiers) | G0917530031 | en_US |
dc.identifier.uri (URI) | https://nccur.lib.nccu.edu.tw/handle/140.119/32706 | - |
dc.description (描述) | 碩士 | zh_TW |
dc.description (描述) | 國立政治大學 | zh_TW |
dc.description (描述) | 資訊科學學系 | zh_TW |
dc.description (描述) | 91753003 | zh_TW |
dc.description (描述) | 93 | zh_TW |
dc.description.abstract (摘要) | 對學者而言,其研究主題的本體論資訊,包括有影響力的論文、有影響力的會議、有影響力的期刊和有影響力的研究者等資料,是學術研究的重要資訊。利用研究主題的本體論資訊,就能對該領域有大致的瞭解。因此本研究的目的,就是針對特定研究主題,自動的從WWW中,探勘出此主題的本體論資訊,包含此主題中有影響力的論文、作者、會議和期刊。我們先從WWW上透過CiteSeer取得與主題相關的學術論文,再透過引用關係去擴充論文集合。由這些論文中利用資訊萃取的技術,找出論文出處和作者。接著分別根據引用關係分析論文、會議、期刊和作者的影響力指標,我們也考慮論文、會議、期刊和作者之間的mutual reinforcing relation,修改Webpage Ranking Algorithms,來幫助計算由論文引用關係所得的影響力指標。我們實做出系統,提供使用者查詢特定研究主題的本體論資訊,並且找出相關學者、期刊、論文的網站。我們請具有該研究主題專長的學者,評估系統的效果,得出將近60%的準確率。 | zh_TW |
dc.description.abstract (摘要) | Ontological information about a research topic, including its influential papers, conferences, journals, and authors, is important to researchers because it gives them an overview of the topic. This research aims to discover the ontological information of a specific research topic automatically from the WWW. First, we collect papers related to the topic by querying CiteSeer, and then extend this set of papers by following citation relationships. Next, information extraction techniques are used to extract metadata from these papers, namely their publication venues and authors. We then analyze the influence of papers, conferences, journals, and authors individually through citation analysis, and also take into account the mutual reinforcing relation among them by modifying Webpage Ranking Algorithms. We implemented a system that lets users query the ontological information of a specific research topic and that locates the websites of related authors, conferences, and journals. Experts in the relevant research topics evaluated the system's results, which reached an accuracy of nearly sixty percent. | en_US |
dc.description.tableofcontents | Chapter 1 Introduction; Chapter 2 Related Work; 2.1 Ontology; 2.2 Informetrics; 2.3 Web Page Ranking Algorithms; Chapter 3 Research Issues; 3.1 Properties for Definition; 3.2 Mutual Reinforcing Relation; 3.3 Dataset; 3.4 Information Extraction; Chapter 4 Proposed Approaches; 4.1 Dataset Collection; 4.1.1 Base Set Finding; 4.1.2 Base Set Extension; 4.2 Information Extraction; 4.2.1 Publication Venue Identification; 4.2.2 Author Identification; 4.2.3 Auxiliary Data; 4.3 Paper-based Ranking; 4.4 Improvements; 4.4.1 Improved Base Set Extension; 4.4.2 Improved Paper-based Ranking; 4.5 Website Locating; Chapter 5 System Architecture and Implementation; 5.1 System Architecture; 5.2 System Implementation; 5.2.1 Query Processor; 5.2.2 DataSet Preparor; 5.2.3 Ranker; 5.2.4 Website Locator; 5.2.5 Clustering Module; 5.2.6 Crawler; 5.2.7 PreProcessor; 5.2.8 Database; 5.2.9 Free Software Packages; Chapter 6 Experimental Design and Evaluation of Results; 6.1 Experimental Design and Evaluation Method; 6.2 Evaluation of Results; 6.2.1 Converge; 6.2.2 Fixed Dataset; 6.2.3 Personal Website Identification; 6.2.4 Summary of Experiments; Chapter 7 Conclusions and Discussion; 7.1 Conclusions; 7.2 Discussion; 7.3 Future Work; References | en_US |
dc.format.extent | 80391 bytes | - |
dc.format.extent | 98693 bytes | - |
dc.format.extent | 112077 bytes | - |
dc.format.extent | 89689 bytes | - |
dc.format.extent | 88337 bytes | - |
dc.format.extent | 80643 bytes | - |
dc.format.extent | 151458 bytes | - |
dc.format.extent | 221309 bytes | - |
dc.format.extent | 175631 bytes | - |
dc.format.extent | 329017 bytes | - |
dc.format.extent | 2461647 bytes | - |
dc.format.extent | 202161 bytes | - |
dc.format.extent | 124806 bytes | - |
dc.format.extent | 94544 bytes | - |
dc.format.mimetype | application/pdf | - |
dc.language.iso | en_US | - |
dc.source.uri (資料來源) | http://thesis.lib.nccu.edu.tw/record/#G0917530031 | en_US |
dc.subject (關鍵詞) | 本體論 | zh_TW |
dc.subject (關鍵詞) | 學術研究 | zh_TW |
dc.subject (關鍵詞) | 全球資訊網 | zh_TW |
dc.subject (關鍵詞) | 資料探勘 | zh_TW |
dc.subject (關鍵詞) | Ontology | en_US |
dc.subject (關鍵詞) | Academic research | en_US |
dc.subject (關鍵詞) | WWW | en_US |
dc.subject (關鍵詞) | Data mining | en_US |
dc.title (題名) | 由全球資訊網探勘學術研究領域的本體論資訊 | zh_TW |
dc.type (資料類型) | thesis | en |
dc.relation.reference (參考文獻) | [1] “About CiteSeer,” http://citeseer.ist.psu.edu/citeseer.html | zh_TW |
dc.relation.reference (參考文獻) | [2] Benjamins, V. R., Fensel, D., Decker, S. and Gomez-Perez, A., “(KA)2: Building Ontologies for the Internet: A Midterm Report,” International Journal of Human-Computer Studies, 51(3), 1999. | zh_TW |
dc.relation.reference (參考文獻) | [3] Berners-Lee, T., Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor, HarperCollins Publishers, New York, 1999. | zh_TW |
dc.relation.reference (參考文獻) | [4] Bharat, K., Chang, B. W., Henzinger, M. and Ruhl, M., “Who Links to Whom: Mining Linkage between Web Sites,” Proc. of the First IEEE International Conference on Data Mining, 2001. | zh_TW |
dc.relation.reference (參考文獻) | [5] “Bit::Vector - Efficient Bit Vector, Set of Integers and Big Int Math Library,” http://www.engelschall.com/~sb/download/Bit-Vector/c | zh_TW |
dc.relation.reference (參考文獻) | [6] Bollacker, K. D., Lawrence, S. and Giles, C. L., “CiteSeer: An Autonomous Web Agent for Automatic Retrieval and Identification of Interesting Publications,” Proc. of the Second International Conference on Autonomous Agents, 1998. | zh_TW |
dc.relation.reference (參考文獻) | [7] Bollacker, K. D., Lawrence, S. and Giles, C. L., “A System For Automatic Personalized Tracking of Scientific Literature on the Web,” Proc. of the Fourth ACM Conference on Digital Libraries, 1999. | zh_TW |
dc.relation.reference (參考文獻) | [8] Bollacker, K. D., Lawrence, S. and Giles, C. L., “Discovering Relevant Scientific Literature on The Web,” IEEE Intelligent Systems, 15(2), 2000. | zh_TW |
dc.relation.reference (參考文獻) | [9] Brin, S. and Page, L., “The Anatomy of a Large-scale Hypertextual Web Search Engine,” Computer Networks and ISDN Systems, 30(1-7), 1998. | zh_TW |
dc.relation.reference (參考文獻) | [10] Byrd, R. J. and Ravin, Y., “Identifying and Extracting Relations in Text,” Proc. of International Conference on Applications of Natural Language to Information Systems NLDB’99, 1999. | zh_TW |
dc.relation.reference (參考文獻) | [11] Chakrabarti, S., Dom, B. E., Kumar, S. R., Rajagopalan, S., Tomkins, A., Gibson, D. and Kleinberg, J., “Mining the Web’s Link Structure,” IEEE Computer, 32(8), 1999. | zh_TW |
dc.relation.reference (參考文獻) | [12] Chakrabarti, S., “Recent Results in Automatic Web Resource Discovery,” ACM Computing Surveys, 31(4), 1999. | zh_TW |
dc.relation.reference (參考文獻) | [13] Dore, J. C. and Ojasoo, T., “How to Analyze Publication Time Trends by Correspondence Factor Analysis: Analysis of Publications by 48 Countries in 18 Disciplines over 12 years,” Journal of the American Society for Information Science, 52(9), 2001. | zh_TW |
dc.relation.reference (參考文獻) | [14] Efe, K., Raghavan, V., Chu, C. H., Broadwater, A. L., Bolelli, L. and Ertekin, S., “The Shape of the Web and Its Implications for Searching the Web,” Proc. of the International Conference on the Advances in Infrastructure for Electronic Business, Science, and Education on the Internet, 2000. | zh_TW |
dc.relation.reference (參考文獻) | [15] Faure, D. and Nedellec, C., “A Corpus-based Conceptual Clustering Method for Verb Frames and Ontology,” Proc. of LREC Workshop on Adapting Lexical and Corpus Resources to Sublanguages and Applications, 1998. | zh_TW |
dc.relation.reference (參考文獻) | [16] Getoor, L., “Link Mining: A New Data Mining Challenge,” SIGKDD Explorations, 4(2), 2003. | zh_TW |
dc.relation.reference (參考文獻) | [17] Giles, C. L., Bollacker, K. D. and Lawrence, S., “CiteSeer: An Automatic Citation Indexing System,” Proc. of the Third ACM Conference on Digital Libraries, 1998. | zh_TW |
dc.relation.reference (參考文獻) | [18] Gomez-Perez, A., Fernandez-Lopez, M. and Corcho, O., Ontological Engineering: with Examples from the Areas of Knowledge Management, E-Commerce and the Semantic Web, Springer-Verlag, 2002. | zh_TW |
dc.relation.reference (參考文獻) | [19] Gomez-Perez, A. and Manzano-Macho, D., “A Survey of Ontology Learning Methods and Techniques,” Technical Report, Institute of Computer Science, Leopold Franzens University of Innsbruck, 2003. | zh_TW |
dc.relation.reference (參考文獻) | [20] Gomez-Perez, A. and Benjamins, V. R., “Overview of Knowledge Sharing and Reuse Components: Ontologies and Problem-solving methods,” Proc. of the Sixteenth International Joint Conference on Artificial Intelligence Workshop on Ontologies and Problem-Solving, 1999. | zh_TW |
dc.relation.reference (參考文獻) | [21] “Journal Selection Process,” http://www.isinet.com/selection/ | zh_TW |
dc.relation.reference (參考文獻) | [22] Kleinberg, J. M., “Hubs, Authorities, and Communities,” ACM Computing Surveys, 31(4), 1999. | zh_TW |
dc.relation.reference (參考文獻) | [23] Kleinberg, J. M., “Authoritative Sources in a Hyperlinked Environment,” Journal of the ACM, 46(5), 1999. | zh_TW |
dc.relation.reference (參考文獻) | [24] Kostoff, R. N., Rio, J. A. D., Humenik, J. A., Garcia, E. O. and Ramirez, A. M., “Citation Mining: Integrating Text Mining and Bibliometrics for Research User Profiling,” Journal of the American Society for Information Science, 52(13), 2001. | zh_TW |
dc.relation.reference (參考文獻) | [25] Larson, R. R., “Bibliometrics of the World Wide Web: An Exploratory Analysis of the Intellectual Structure of Cyberspace,” Proc. American Society for Information Science and Technology ASIS 96, 1996. | zh_TW |
dc.relation.reference (參考文獻) | [26] Lawrence, S., Giles, C. L. and Bollacker, K., “Digital Libraries and Autonomous Citation Indexing,” IEEE Computer, 32(6), 1999. | zh_TW |
dc.relation.reference (參考文獻) | [27] Lawrence, S., Bollacker, K. and Giles, C. L., “Indexing and Retrieval of Scientific Literature,” Proc. Eighth International Conference on Information and Knowledge Management CIKM 99, 1999. | zh_TW |
dc.relation.reference (參考文獻) | [28] Lawrie, D. and Croft, W. B., “Discovering and Comparing Topic Hierarchies,” Proc. of RIAO 2000 Conference, 2000. | zh_TW |
dc.relation.reference (參考文獻) | [29] Lempel, R. and Moran, S., “The Stochastic Approach for Link-Structure Analysis (SALSA) and the TKC Effect,” Proc. of the Ninth International World Wide Web Conference, 2000. | zh_TW |
dc.relation.reference (參考文獻) | [30] Maedche, A. and Staab, S., “Discovering Conceptual Relations from Text,” Proc. of European Conference on Artificial Intelligence ECAI’00, 2000. | zh_TW |
dc.relation.reference (參考文獻) | [31] Maedche, A. and Staab, S., “Ontology Learning for the Semantic Web,” IEEE Intelligent Systems, 16(2), 2001. | zh_TW |
dc.relation.reference (參考文獻) | [32] McGovern, A., Friedland, L., Hay, M., Gallagher, B. and Fast, A., “Exploiting Relational Structure to Understand Publication Patterns in High-Energy Physics,” SIGKDD Explorations, 5(2), 2003. | zh_TW |
dc.relation.reference (參考文獻) | [33] Page, L., Brin, S., Motwani, R. and Winograd, T., “The PageRank Citation Ranking: Bringing Order to the Web,” http://google.stanford.edu/~backrub/pageranksub.ps | zh_TW |
dc.relation.reference (參考文獻) | [34] van der Vet, P. E. and Mars, N. J. I., “Bottom-Up Construction of Ontologies,” IEEE Transactions on Knowledge and Data Engineering, 10(4), 1998. | zh_TW |
dc.relation.reference (參考文獻) | [35] Popescul, A., Flake, G. W., Lawrence, S., Ungar, L. H. and Giles, C. L., “Clustering and Identifying Temporal Trends in Document Databases,” Proc. of the Fifth IEEE Advances in Digital Libraries, 2000. | zh_TW |
dc.relation.reference (參考文獻) | [36] Rafiei, D. and Mendelzon, A. O., “What is this Page Known for? Computing Web Page Reputations,” Proc. of the Ninth International World Wide Web Conference, 2000. | zh_TW |
dc.relation.reference (參考文獻) | [37] “search.cpan.org: LWP-The World Wide Web library for Perl,” http://search.cpan.org/~gaas/libwww-perl-5.76/lib/LWP.pm | zh_TW |
dc.relation.reference (參考文獻) | [38] “search.cpan.org: Math::Cephes::Matrix - Perl interface to the cephes matrix routines,” http://search.cpan.org/~rkobes/Math-Cephes-0.42/lib/Math/Cephes/Matrix.pm | zh_TW |
dc.relation.reference (參考文獻) | [39] Shih, F. M., Discovering Ontological Information from the On-line Publications, Master Thesis, Institute of Computer and Information Science, National Chiao Tung University, 2003. | zh_TW |
dc.relation.reference (參考文獻) | [40] Shum, S. B., Motta, E. and Domingue, J., “ScholOnto: An Ontology-Based Digital Library Server for Research Documents and Discourse,” International Journal on Digital Libraries, 3(3), 2000. | zh_TW |
dc.relation.reference (參考文獻) | [41] Small, H., “Visualizing Science by Citation Mapping,” Journal of the American Society for Information Science, 50(9), 1999. | zh_TW |
dc.relation.reference (參考文獻) | [42] Suryanto, H. and Compton, P., “Discovery of Ontologies from Knowledge Bases,” Proc. of the First International Conference on Knowledge Capture, New York, USA, 2001. | zh_TW |
dc.relation.reference (參考文獻) | [43] Vaughan, L. and Shaw, D., “Bibliographic and Web Citations: What Is the Difference?” Journal of the American Society for Information Science, 54(14), 2003. | zh_TW |
dc.relation.reference (參考文獻) | [44] White, H. D., “Author Cocitation Analysis and Pearson’s r,” Journal of the American Society for Information Science, 54(13), 2003. | zh_TW |
dc.relation.reference (參考文獻) | [45] White, H. D., “Pathfinder Networks and Author Cocitation Analysis: A Remapping of Paradigmatic Information Scientists,” Journal of the American Society for Information Science, 54(5), 2003. | zh_TW |
dc.relation.reference (參考文獻) | [46] Yaru, D., “Brief Communication Structural Modeling of Network Systems in Citation Analysis,” Journal of the American Society for Information Science, 48(10), 1997. | zh_TW |
dc.relation.reference (參考文獻) | [47] Yu, P. S., Li, X. and Liu, B., “On the Temporal Dimension of Search,” Proc. of the Thirteenth International World Wide Web Conference, 2004. | zh_TW |
dc.relation.reference (參考文獻) | [48] Zha, H., “Generic Summarization and Keyphrase Extraction Using Mutual Reinforcement Principle and Sentence Clustering,” Proc. of the Twenty-Fifth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2002. | zh_TW |
dc.relation.reference (參考文獻) | [49] 蔡明月, 資訊計量學與文獻特性, 國立編譯館, 2003. | zh_TW |