An Efficient Algorithm for Mining Frequent Itemsets over the Entire History of Data Streams | Publication | NCCU Academic Hub

Title	An Efficient Algorithm for Mining Frequent Itemsets over the Entire History of Data Streams
Author	沈錳坤
Contributor	Department of Computer Science, National Chengchi University (國立政治大學資訊科學系)
Keywords	Mining; Frequent Itemsets; History of Data Streams
Date	2004-09
Upload time	27-May-2010 16:50:45 (UTC+8)
Abstract	A data stream is a continuous, huge, fast-changing, rapid, infinite sequence of data elements. The nature of streaming data makes it essential to use online algorithms that require only one scan over the data for knowledge discovery. In this paper, we propose a new single-pass algorithm, called DSM-FI (Data Stream Mining for Frequent Itemsets), to mine all frequent itemsets over the entire history of data streams. DSM-FI has three major features: a single scan of the streaming data for counting itemsets' frequency information, an extended prefix-tree-based compact pattern representation, and a top-down frequent itemset discovery scheme. Our performance study shows that DSM-FI outperforms the well-known algorithm Lossy Counting in the same streaming environment.
Relation	First International Workshop on Knowledge Discovery in Data Streams, in conjunction with the European Conference on Machine Learning (ECML) and the European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD)
Type	conference
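
The abstract names Lossy Counting as the single-pass baseline. As a rough illustration of what one-scan frequency counting over a stream looks like, here is a minimal sketch of Lossy Counting for single items (not itemsets). It is not DSM-FI and is not taken from the paper; the function names `lossy_counting` and `frequent_items` and the parameter names are assumptions made for this example.

```python
import math


def lossy_counting(stream, epsilon):
    """One pass over `stream`, keeping approximate counts with error <= epsilon * N.

    Illustrative sketch of item-level Lossy Counting; names are hypothetical.
    """
    width = math.ceil(1 / epsilon)  # bucket width
    counts = {}                     # item -> [observed count, max possible undercount]
    n = 0                           # number of elements seen so far
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in counts:
            counts[item][0] += 1
        else:
            # A new entry may have been missed in up to (bucket - 1) earlier buckets.
            counts[item] = [1, bucket - 1]
        if n % width == 0:
            # Bucket boundary: prune entries that cannot be frequent.
            stale = [k for k, (f, d) in counts.items() if f + d <= bucket]
            for key in stale:
                del counts[key]
    return counts, n


def frequent_items(counts, n, support, epsilon):
    """Report items whose observed count reaches (support - epsilon) * n."""
    return {item for item, (f, _) in counts.items() if f >= (support - epsilon) * n}


stream = ["a", "b", "a", "c", "a", "b", "a", "d"] * 5
counts, n = lossy_counting(stream, epsilon=0.125)
print(frequent_items(counts, n, support=0.4, epsilon=0.125))
```

Memory stays bounded because rare items are pruned at each bucket boundary, at the cost of undercounting any item by at most `epsilon * n`.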

dc.contributor	Department of Computer Science, National Chengchi University (國立政治大學資訊科學系)	en_US
dc.creator (author)	沈錳坤	zh_TW
dc.date (date)	2004-09	en_US
dc.date.accessioned	27-May-2010 16:50:45 (UTC+8)	-
dc.date.available	27-May-2010 16:50:45 (UTC+8)	-
dc.date.issued (upload time)	27-May-2010 16:50:45 (UTC+8)	-
dc.identifier.uri (URI)	http://nccur.lib.nccu.edu.tw/handle/140.119/39799	-
dc.description.abstract (abstract)	A data stream is a continuous, huge, fast-changing, rapid, infinite sequence of data elements. The nature of streaming data makes it essential to use online algorithms that require only one scan over the data for knowledge discovery. In this paper, we propose a new single-pass algorithm, called DSM-FI (Data Stream Mining for Frequent Itemsets), to mine all frequent itemsets over the entire history of data streams. DSM-FI has three major features: a single scan of the streaming data for counting itemsets' frequency information, an extended prefix-tree-based compact pattern representation, and a top-down frequent itemset discovery scheme. Our performance study shows that DSM-FI outperforms the well-known algorithm Lossy Counting in the same streaming environment.	-
dc.language	en-US	en_US
dc.language.iso	en_US	-
dc.relation (relation)	First International Workshop on Knowledge Discovery in Data Streams, in conjunction with the European Conference on Machine Learning (ECML) and the European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD)	en_US
dc.subject (keywords)	Mining; Frequent Itemsets; History of Data Streams	en_US
dc.title (title)	An Efficient Algorithm for Mining Frequent Itemsets over the Entire History of Data Streams	en_US
dc.type (type)	conference	en