Please use this identifier to cite or link to this item:
https://ah.lib.nccu.edu.tw/handle/140.119/35243
題名: | 以學名結構為基礎之網路搜尋負載量模型設計 A Generic Construct based Workload Model for Web Search |
作者: | 柯怡芬 Ke, I Fen |
貢獻者: | 管郁君<br>諶家蘭 Huang, E.Y.<br>Seng, J.L. 柯怡芬 Ke, I Fen |
關鍵詞: | 網路搜尋 績效評估 負載量模型 學名結構 web search benchmark workload model generic construct performance evaluation |
日期: | 2006 | 上傳時間: | 18-Sep-2009 | 摘要: | 網際網路搜尋是很重要的工具,可用以蒐集或尋找資訊。然而搜尋結果有時無法完全符合使用者的原意,所以網際網路搜尋引擎公司致力於發展更好的搜尋演算法,是為了增進搜尋結果的準確性並提高使用者對搜尋引擎的使用率,我們從探討的文獻中發現目前並沒有一個較彈性、開放的工具來評量網路搜尋的效能。本研究的目的就是希望能發展出一個較具彈性的負載量模型以針對網路搜尋進行效能評量。本研究著重在效能評量的負載量模型及測試套組的設計,我們希望透過以學名結構為基礎的方法擴展負載量模型的彈性,我們蒐集及研討幾個具代表性的網路搜尋演算法,並找出這些主要演算法的學名結構,以這些學名結構為基礎進行負載量模型的設計,負載量模型包含網頁模型、查詢模型與控制模型。最後,我們利用雛形實作來驗證本研究所提出的研究方法。 Web search service is a vital way to find information on the web. However, not every piece of information found is relevant or useful. In order to improve search accuracy, most designers of the web search engines devote to working on search algorithms development and optimization. From literature, we realize that there are few open or flexible performance evaluation methods for web search service. The objective of this research is to develop a more flexible workload model based on generic construct for web search benchmarking and build an automated benchmarking environment of performance evaluation. Generic constructs are major components which can represent the web search algorithm. We collect and review literature related to web search algorithms and benchmarking. And we identify the generic constructs of key web search algorithms. The workload model consists of a page model, query model and control model. The page model describes the web page structure in web search. The query model defines some important criteria to query the web search engines. The control model defines the variables that used to set up the benchmark environment. Finally, we validate the research model through the prototype implementation. |
參考文獻: | [1]. Bharat, K., & Henzinger, M. R. (1998). Improved algorithms for topic distillation in a hyperlinked environment. SIGIR `98: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Melbourne, Australia. 104-111. from http://doi.acm.org/10.1145/290941.290972 [2]. Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. [Electronic version]. Computer Networks & ISDN Systems, 30, 107-118. [3]. Can, F., Nuray, R., & Sevdik, A. B. (2004). Automatic performance evaluation of web search engines. [Electronic version]. Information Processing and Management, 40(3, May, 2004), 495-514. [4]. Chidlovskii, B., Roustant, B., & Brette, M. (2006). Documentum ECI self-repairing wrappers: Performance analysis. SIGMOD `06: Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, Chicago, IL, USA. 708-717. from http://doi.acm.org/10.1145/1142473.1142555 [5]. C. J., van Rijsbergen. Information retrieval (online book)., 2006 from http://www.dcs.gla.ac.uk/Keith/Preface.html [6]. Clarke, S., & Willett, P. (1997). Estimating the recall performance of search engines. ASLIB Proceedings, 49 (7), 184-189. [7]. David, H., Nick, C., Peter, B., & Kathleen, G. (2001). Measuring search engine quality. [Electronic version]. Information Retrieval, 4(1), 33-33. [8]. Hastie, T., Tibshirani, R., & Friedman, J. H. (2001). The elements of statistical learning : Data mining, inference, and prediction [9]. Jansen, B. J., & Spink, A. (2006). How are we searching the world wide web? A comparison of nine search engine transaction logs. [Electronic version]. Information Processing and Management, 1, January, 2006(42), 248-263. [10]. Ji-Rong , W., Ruihua, S., Deng, C., Kaihua, Z., Sphipeng, Y., & Shaozhi, Y., et al. (2003). MICROSOFT RESERACH ASIA AT THE WEB TRACK OF TREC 2003. Paper presented at the Text Retrieval Conference 2003, 408-408. [11]. Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environment. J.ACM, 46(5), 604-632. [12]. Kraaij, W., Westerveld, T., & Hiemstra, D. (2002). The importance of prior probabilities for entry page search. SIGIR `02: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Tampere, Finland. 27-34. from http://doi.acm.org/10.1145/564376.564383 [13]. Lawrence, P., Sergey, B., Rajeev, M., & Terry, W. (1998). The PageRank citation ranking: Bringing order to the web, Stanford Digital Libraries Working Paper. [14]. Li, L., Shang, Y., & Zhang, W. (2002). Improvement of HITS-based algorithms on web documents. WWW `02: Proceedings of the 11th International Conference on World Wide Web, Honolulu, Hawaii, USA. 527-535. from http://doi.acm.org/10.1145/511446.511514 [15]. Nick, C., & David , H. (2004). Overview of the TREC-2004 web track. Paper presented at the Text Retrieval Conference 2004. [16]. Pant, G. (2003). Deriving link-context from HTML tag tree. DMKD `03: Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, San Diego, California. 49-55. from http://doi.acm.org/10.1145/882082.882094 [17]. Qin, T., Liu, T., Zhang, X., Feng, G., Wang, D., & Ma, W. (2007). Topic distillation via sub-site retrieval. [Electronic version]. Information Processing and Management, 43(2, March, 2007), 445-460. [18]. Richard, J. Measuring search effectiveness., 2006, from http://www.hsl.creighton.edu/hsl/Searching/Recall-Precision.html [19]. S E , R., & S, W. (1999). Okapi/Keenbow at TREC-8. Paper presented at the The Eighth Text Retrieval Conference (TREC 8), 151-162. [20]. S E, R., & K , S. J. (1976). Relevance weighting of search terms. [Electronic version]. Journal of the American Society for Information Science, 27(May-June), 129-146. [21]. Scarpa, M., Puliafito, A., Villari, M., & Zaia, A. (2004). A modeling technique for the performance analysis of web searching applications. IEEE Transactions on Knowledge and Data Engineering, 16(11), 1339-1356. [22]. Shafi, S. M., & Rather, R. A. (2005). \"Precision and Recall of Five Search Engines for Retrieval of Scholarly Information in the Field of Biotechnology.\" Webology, 2 (2), Article 12. Available at: http://www.webology.ir/2005/v2n2/a12.html [23]. Stephen, R. (2002). Threshold setting and performance optimization in adaptive filtering. [Electronic version]. Information Retrieval, 5(2-3), 239-239. [24]. Vapnik, V. N. (1998). Statistical learning theory Willey. [25]. Vaughan, L. (2004). New measurements for search engine evaluation proposed and tested. [Electronic version]. Information Processing and Management, 40(4, July, 2004), 677-691. |
描述: | 碩士 國立政治大學 資訊管理研究所 94356017 95 |
資料來源: | http://thesis.lib.nccu.edu.tw/record/#G0094356017 | 資料類型: | thesis | DOI: | http://dx.doi.org/10.1016/j.ipm.2009.04.004 |
Appears in Collections: | 學位論文 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
601701.pdf | 133.49 kB | Adobe PDF2 | View/Open | |
601702.pdf | 140.12 kB | Adobe PDF2 | View/Open | |
601703.pdf | 131.62 kB | Adobe PDF2 | View/Open | |
601704.pdf | 129.54 kB | Adobe PDF2 | View/Open | |
601705.pdf | 70.88 kB | Adobe PDF2 | View/Open | |
601706.pdf | 188.42 kB | Adobe PDF2 | View/Open | |
601707.pdf | 110.95 kB | Adobe PDF2 | View/Open | |
601708.pdf | 189.49 kB | Adobe PDF2 | View/Open | |
601709.pdf | 1.3 MB | Adobe PDF2 | View/Open | |
601710.pdf | 77.44 kB | Adobe PDF2 | View/Open | |
601711.pdf | 88.13 kB | Adobe PDF2 | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.