Academic Output - Theses

Title 全球資訊網中使用者網頁-動作路徑的資料挖掘 (Data Mining of Users' Page-Action Paths on the World Wide Web)
Author 林青峰
Lin, Qing-Fung
Contributors 楊亨利
Yang, Herng-Li
林青峰
Lin, Qing-Fung
Keywords World Wide Web
Web Usage Mining
Data Mining
User Behavior
Page Path
Action Path
Date 2003
Upload time 18-Sep-2009 14:27:26 (UTC+8)
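Abstract   Customers exhibit many different behaviors when they shop. For an organization, studying customers' purchasing behavior helps it understand its customers better and thereby supports its business activities. Previous data mining research on customer behavior has focused mainly on purchase data, and has not examined comprehensively or in depth which actions customers perform in a store or which events those actions lead to. For physical retailers, actually recording what users do inside a store is not very feasible; by contrast, with the development of the Internet and data collection technology, website operators who retain logs can collect consumer behavior records more easily and more completely than traditional retailers. This study attempts to build, in the World Wide Web environment, an algorithm that analyzes users' page browsing paths and their action sequences at the same time; it also builds a prototype system around the algorithm to verify its performance, and finally evaluates the feasibility of using it in practice.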
Different kinds of customers purchase with different behaviors. Studying customers' purchase behavior can help organizations understand their clients' intentions and thereby support their business activities. In the past, data mining of customer behavior emphasized purchased items, i.e., what customers buy. Few studies discussed what paths customers took and what actions they performed in an e-store. It is impractical for a physical store to record all of its customers' actions and paths. However, a website store can easily collect such data in an Internet log. This study proposes a data mining algorithm that can analyze both customers' browsing pages and their action paths. The algorithm's efficiency and feasibility were examined in our prototype. This study may help website managers restructure their website layouts or advertisement positions to catch customers' eyes.
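To make the idea of mining page-action paths concrete, here is a minimal sketch. It is not the thesis's Page-Action algorithm, which this record does not reproduce. It assumes each session in the log has already been reduced to an ordered list of (page, action) pairs, and it simply counts how many sessions contain each contiguous sub-path. The function name, the page and action labels, the `min_support` threshold, and the `max_len` cap are all hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

# A step is one (page, action) pair; the labels used below are hypothetical.
Step = Tuple[str, str]
Path = Tuple[Step, ...]


def frequent_page_action_paths(
    sessions: List[List[Step]], min_support: int, max_len: int = 3
) -> Dict[Path, int]:
    """Return contiguous (page, action) sub-paths found in at least
    `min_support` sessions. A session counts at most once per sub-path."""
    support: Counter = Counter()
    for session in sessions:
        seen = set()
        for length in range(1, max_len + 1):
            # Slide a window of the given length over the session.
            for start in range(len(session) - length + 1):
                seen.add(tuple(session[start:start + length]))
        support.update(seen)  # one increment per distinct sub-path per session
    return {path: count for path, count in support.items() if count >= min_support}


# Illustrative sessions (assumed data, not taken from the thesis).
sessions = [
    [("home", "view"), ("product", "view"), ("product", "add_to_cart"), ("cart", "checkout")],
    [("home", "view"), ("product", "view"), ("product", "add_to_cart")],
    [("home", "view"), ("search", "query"), ("product", "view")],
]
result = frequent_page_action_paths(sessions, min_support=2)
for path, count in sorted(result.items(), key=lambda item: (len(item[0]), item[0])):
    print(count, " -> ".join(f"{page}:{action}" for page, action in path))
```

With these sample sessions, the path home:view -> product:view -> product:add_to_cart is reported with support 2, while anything involving cart:checkout is dropped because it occurs in only one session. A level-wise join that builds longer candidates from shorter frequent ones would avoid enumerating every window, but that refinement is left out here to keep the sketch short.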
Description Master's
National Chengchi University
Graduate Institute of Management Information Systems
91356030
92
Source http://thesis.lib.nccu.edu.tw/record/#G0091356030
Type thesis
dc.contributor.advisor 楊亨利 (zh_TW)
dc.contributor.advisor Yang, Herng-Li (en_US)
dc.identifier (Other identifier) G0091356030
dc.identifier.uri (URI) https://nccur.lib.nccu.edu.tw/handle/140.119/35211
dc.description.tableofcontents Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables

Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Research Process and Steps
1.5 Research Scope and Limitations
1.6 Thesis Organization

Chapter 2 Literature Review
2.1 Data Mining
2.1.1 Categories of Data Mining
2.1.2 Types of Rules Discoverable by Data Mining
2.1.3 Data Mining Algorithms
2.2 Web Mining
2.2.1 Web Usage Mining

Chapter 3 A Data Mining Algorithm for Page-Action Paths
3.1 Page-Action Paths
3.1.1 Data Structure of Page-Action Paths
3.1.2 Path and Length of Page-Action Paths
3.2 Page-Action Algorithm
3.2.1 Page-Action Path Patterns (P-A Pattern)
3.2.2 Page-Action Path Pattern Scanning (P-A_ScanX)
3.2.3 Threshold for Page-Action Paths (Large Threshold in P-A Algorithm)
3.2.4 Page-Action Path Join (P-A_Join)
3.3 Implementation of the Page-Action Path Algorithm
3.3.1 Computation Without Repeated Counting (PA_Scan1 Algorithm)
3.3.2 Computation With Repeated Counting (PAScan_N Algorithm)
3.3.3 Finding the Association Rules
3.4 Walkthrough of a Plausible Real-World Example
3.5 Efficiency and Benefits of the Algorithm

Chapter 4 Prototype System Implementation and Data Analysis
4.1 Functions of the Prototype System
4.1.1 Log Processing Subsystem (P-A Log System)
4.1.2 Page-Action Data Mining Subsystem (P-A Mining System)
4.2 Performance of the Prototype System
4.2.1 Effect of Data Volume on the Page-Action Path Algorithm
4.2.2 Effect of the Support Count on the Page-Action Path Algorithm
4.2.3 Effect of Data Type on the Page-Action Path Algorithm

Chapter 5 Conclusions and Future Research Directions
5.1 Research Contributions
5.1.1 Academic Contributions
5.1.2 Practical Contributions
5.2 Suggestions for Future Research

References
dc.format.extent 74747, 128683, 110734, 145882, 185039, 320364, 708089, 114856, 82764, 96392 bytes
dc.format.mimetype application/pdf
dc.language.iso en_US