Please use this identifier to cite or link to this item: https://ah.lib.nccu.edu.tw/handle/140.119/35243
Title: A Generic Construct based Workload Model for Web Search (以學名結構為基礎之網路搜尋負載量模型設計)
Author: Ke, I Fen (柯怡芬)
Contributors: Huang, E.Y. (管郁君); Seng, J.L. (諶家蘭); Ke, I Fen (柯怡芬)
Keywords: web search; benchmark; workload model; generic construct; performance evaluation
Date: 2006
Uploaded: 18-Sep-2009
Abstract: Web search is a vital way to find information on the web. However, not every result returned is relevant or useful. To improve search accuracy, most web search engine designers devote their effort to developing and optimizing search algorithms. Our literature review shows that few open or flexible performance evaluation methods exist for web search services. The objective of this research is to develop a more flexible workload model, based on generic constructs, for web search benchmarking, and to build an automated benchmarking environment for performance evaluation. Generic constructs are the major components that represent a web search algorithm. We collect and review the literature on web search algorithms and benchmarking, and identify the generic constructs of key web search algorithms. The workload model consists of a page model, a query model, and a control model. The page model describes the structure of web pages in web search. The query model defines important criteria for querying web search engines. The control model defines the variables used to set up the benchmark environment. Finally, we validate the research model through a prototype implementation.
Description: Master's degree
National Chengchi University
Graduate Institute of Information Management
94356017
95
Source: http://thesis.lib.nccu.edu.tw/record/#G0094356017
Type: thesis
DOI: http://dx.doi.org/10.1016/j.ipm.2009.04.004
Appears in Collections: Theses (學位論文)

Files in This Item:
File          Size        Format
601701.pdf    133.49 kB   Adobe PDF
601702.pdf    140.12 kB   Adobe PDF
601703.pdf    131.62 kB   Adobe PDF
601704.pdf    129.54 kB   Adobe PDF
601705.pdf    70.88 kB    Adobe PDF
601706.pdf    188.42 kB   Adobe PDF
601707.pdf    110.95 kB   Adobe PDF
601708.pdf    189.49 kB   Adobe PDF
601709.pdf    1.3 MB      Adobe PDF
601710.pdf    77.44 kB    Adobe PDF
601711.pdf    88.13 kB    Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.