Citation Information

Title DSM-PLW: Single-pass mining of path traversal patterns over streaming Web click-sequences
Authors Shan, Man-kwan; Li, Hua-fu; Lee, Suh-yin
沈錳坤
Contributor Department of Computer Science
Keywords Web click-sequence streams; Path traversal patterns; Single-pass algorithm
Date 2006
Upload time 17-Jun-2015 15:44:23 (UTC+8)
Abstract Mining Web click streams is an important data mining problem with broad applications. However, it is also a difficult problem since the streaming data possess some interesting characteristics, such as unknown or unbounded length, possibly a very fast arrival rate, inability to backtrack over previously arrived click-sequences, and a lack of system control over the order in which the data arrive. In this paper, we propose a projection-based, single-pass algorithm, called DSM-PLW (Data Stream Mining for Path traversal patterns in a Landmark Window), for online incremental mining of path traversal patterns over a continuous stream of maximal forward references generated at a rapid rate. According to the algorithm, each maximal forward reference of the stream is projected into a set of reference-suffix maximal forward references, and these reference-suffix maximal forward references are inserted into a new in-memory summary data structure, called SP-forest (Summary Path traversal pattern forest), which is an extended prefix tree-based data structure for storing essential information about frequent reference sequences of the stream so far. The set of all maximal reference sequences is determined from the SP-forest by a depth-first-search mechanism, called MRS-mining (Maximal Reference Sequence mining). Theoretical analysis and experimental studies show that the proposed algorithm has gently growing memory requirements and makes only one pass over the streaming data. © 2005 Elsevier B.V. All rights reserved.
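The projection step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes that "reference-suffix" means every non-empty suffix of a maximal forward reference, it uses a plain counted prefix tree in place of the SP-forest, and it omits the pruning of infrequent entries and the MRS-mining step. The names `Node`, `suffixes`, `insert`, `process`, and `support` are hypothetical.

```python
class Node:
    """One page in a counted prefix tree (a hypothetical stand-in for an SP-forest node)."""

    def __init__(self):
        self.count = 0       # number of inserted suffixes whose path passes through this node
        self.children = {}   # next page -> Node


def suffixes(reference):
    """Assumed projection: every non-empty suffix of one maximal forward reference."""
    return [reference[i:] for i in range(len(reference))]


def insert(roots, sequence):
    """Insert one sequence into the forest, incrementing the count of each node on its path."""
    level = roots
    for page in sequence:
        node = level.setdefault(page, Node())
        node.count += 1
        level = node.children


def process(roots, reference):
    """Single-pass update: project the incoming reference and insert each projection."""
    for suffix in suffixes(reference):
        insert(roots, suffix)


def support(roots, sequence):
    """Return how many contiguous occurrences of `sequence` have been seen so far."""
    level = roots
    node = None
    for page in sequence:
        node = level.get(page)
        if node is None:
            return 0
        level = node.children
    return node.count if node is not None else 0


# A toy stream of three maximal forward references over pages a, b, c, d.
roots = {}
for reference in (["a", "b", "c"], ["a", "b", "d"], ["b", "c"]):
    process(roots, reference)

print(support(roots, ["b", "c"]))  # 2
print(support(roots, ["b"]))       # 3
```

Because every suffix is inserted from its first page, any contiguous subsequence of a reference becomes a root-to-node path, so its count can be read with a single walk and the stream never needs to be revisited.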
Relation Computer Networks, vol. 50, no. 10, pp. 1474-1487
Type article
DOI http://dx.doi.org/10.1016/j.comnet.2005.10.018
dc.contributor Department of Computer Science
dc.creator (Author) Shan, Man-kwan; Li, Hua-fu; Lee, Suh-yin
dc.creator (Author) 沈錳坤 (zh_TW)
dc.date (Date) 2006
dc.date.accessioned 17-Jun-2015 15:44:23 (UTC+8)
dc.date.available 17-Jun-2015 15:44:23 (UTC+8)
dc.date.issued (Upload time) 17-Jun-2015 15:44:23 (UTC+8)
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/75893
dc.description.abstract (Abstract) Mining Web click streams is an important data mining problem with broad applications. However, it is also a difficult problem since the streaming data possess some interesting characteristics, such as unknown or unbounded length, possibly a very fast arrival rate, inability to backtrack over previously arrived click-sequences, and a lack of system control over the order in which the data arrive. In this paper, we propose a projection-based, single-pass algorithm, called DSM-PLW (Data Stream Mining for Path traversal patterns in a Landmark Window), for online incremental mining of path traversal patterns over a continuous stream of maximal forward references generated at a rapid rate. According to the algorithm, each maximal forward reference of the stream is projected into a set of reference-suffix maximal forward references, and these reference-suffix maximal forward references are inserted into a new in-memory summary data structure, called SP-forest (Summary Path traversal pattern forest), which is an extended prefix tree-based data structure for storing essential information about frequent reference sequences of the stream so far. The set of all maximal reference sequences is determined from the SP-forest by a depth-first-search mechanism, called MRS-mining (Maximal Reference Sequence mining). Theoretical analysis and experimental studies show that the proposed algorithm has gently growing memory requirements and makes only one pass over the streaming data. © 2005 Elsevier B.V. All rights reserved.
dc.format.extent 1114297 bytes
dc.format.mimetype application/pdf
dc.relation (Relation) Computer Networks, vol. 50, no. 10, pp. 1474-1487
dc.subject (Keywords) Web click-sequence streams; Path traversal patterns; Single-pass algorithm
dc.title (Title) DSM-PLW: Single-pass mining of path traversal patterns over streaming Web click-sequences
dc.type (Type) article (en)
dc.identifier.doi (DOI) 10.1016/j.comnet.2005.10.018
dc.doi.uri (DOI) http://dx.doi.org/10.1016/j.comnet.2005.10.018