Academic Output - Periodical Articles

Title MISFP-Growth: Hadoop-Based Frequent Pattern Mining with Multiple Item Support
Authors 張瑞岩; 王貞淑*
Contributor Department of Management Information Systems, third-year doctoral program (資管博三)
Keywords big data analytics; Hadoop MapReduce parallel computing; frequent pattern discovery; multiple item support
Date 2019-05
Upload time 30-Oct-2019 11:14:20 (UTC+8)
Abstract In practice, single item support cannot comprehensively address the complexity of items in large datasets. In this study, we propose a big data analytics framework (named Multiple Item Support Frequent Patterns, MISFP-growth algorithm) that uses Hadoop-based parallel computing to achieve high-efficiency mining of itemsets with multiple item supports (MIS). The proposed architecture consists of two phases. First, in the counting support phase, a Hadoop MapReduce architecture is employed to determine the support for each item. Next, in the analytics phase, sub-transaction blocks are generated according to MIS and the MISFP-growth algorithm identifies the frequency of patterns. To facilitate decision makers in setting MIS, we also propose the concept of classification of item (COI), which classifies items of higher homogeneity into the same class, by which the items inherit class support as their item support. Three experiments were implemented to validate the proposed Hadoop-based MISFP-growth algorithm. The experimental results show an approximately 38% reduction in execution time on parallel architectures. The proposed MISFP-growth algorithm can be implemented on the distributed computing framework. Furthermore, according to the experimental results, the enhanced performance of the proposed algorithm indicates that it could have big data analytics applications.
Relation Applied Sciences, Vol. 9, No. 10, pp. 2075
Type article
dc.contributor Department of Management Information Systems, third-year doctoral program (資管博三)
dc.creator (Author) 張瑞岩
dc.creator (Author) 王貞淑*
dc.date (Date) 2019-05
dc.date.accessioned 30-Oct-2019 11:14:20 (UTC+8)
dc.date.available 30-Oct-2019 11:14:20 (UTC+8)
dc.date.issued (Upload time) 30-Oct-2019 11:14:20 (UTC+8)
dc.identifier.uri (URI) http://nccur.lib.nccu.edu.tw/handle/140.119/127184
dc.format.extent 1650204 bytes
dc.format.mimetype application/pdf
dc.relation (Relation) Applied Sciences, Vol. 9, No. 10, pp. 2075
dc.subject (Keywords) big data analytics; Hadoop MapReduce parallel computing; frequent pattern discovery; multiple item support
dc.title (Title) MISFP-Growth: Hadoop-Based Frequent Pattern Mining with Multiple Item Support
dc.type (Type) article
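
Two ideas from the abstract, counting the support of each item in a map/reduce style and letting items inherit a class-level minimum support (COI), can be illustrated with a small in-memory sketch. This is not the authors' MISFP-growth implementation and it does not use Hadoop; the function names, the sample transactions, the item classes, and the thresholds are all hypothetical, chosen only to show how a per-class threshold keeps a rare item that a single global threshold would drop.

```python
from collections import Counter
from itertools import chain


def map_phase(transaction):
    """Emit (item, 1) once for each distinct item in a transaction."""
    return [(item, 1) for item in set(transaction)]


def reduce_phase(pairs):
    """Sum the emitted counts per item, as a reducer would."""
    counts = Counter()
    for item, n in pairs:
        counts[item] += n
    return dict(counts)


def inherit_class_support(item_class, class_support):
    """Assign each item the minimum support of its class (the COI idea)."""
    return {item: class_support[cls] for item, cls in item_class.items()}


def frequent_items(counts, n_transactions, item_mis):
    """Keep items whose relative support meets their own minimum support."""
    return sorted(
        item for item, c in counts.items()
        if c / n_transactions >= item_mis[item]
    )


# Hypothetical data: four transactions, two item classes.
transactions = [
    ["bread", "milk"],
    ["bread", "caviar"],
    ["milk", "bread", "butter"],
    ["milk"],
]
item_class = {
    "bread": "staple", "milk": "staple", "butter": "staple",
    "caviar": "luxury",
}
class_support = {"staple": 0.5, "luxury": 0.25}

counts = reduce_phase(chain.from_iterable(map_phase(t) for t in transactions))
item_mis = inherit_class_support(item_class, class_support)
result = frequent_items(counts, len(transactions), item_mis)
print(result)
```

In this toy data, "caviar" and "butter" both appear in one of four transactions, but only "caviar" is kept because its class carries the lower threshold. Mining itemsets longer than one item, and the sub-transaction blocks described in the abstract, are outside the scope of this sketch.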