Enhanced Index based GenMax for Frequent Item Set Mining

S. Asokkumar; S. Thangavel

Abstract

Mining frequent item sets is a fundamental problem in many data mining applications, including the discovery of association rules, strong rules, and other important discovery tasks. Existing methods mine frequent item sets using a prefix-tree structure that stores the information in compressed form. GenMax is used for mining maximal frequent item sets. It uses a technique called progressive focusing to perform maximality checking, and differential set propagation to perform fast frequency computation. However, the GenMax algorithm was not implemented for closed frequent item sets. This paper presents an improved index-based enhancement of the GenMax algorithm for effective, fast pruning of maximal frequent and closed frequent item sets with lower memory use. The extension induces a search tree on the set of frequent closed item sets, so that closed item sets can be enumerated completely and without duplication. The memory used in mining the maximal frequent item sets does not depend on the number of frequent closed item sets. The proposed model reduces the number of disk I/Os and allows frequent item set mining to scale to large transactional databases. Experimental results compare the improved index-based GenMax with the existing GenMax for efficient pruning of maximal frequent and closed frequent item sets in terms of item precision and speed.
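To make the terms concrete, the following is a minimal brute-force sketch of what "frequent", "closed", and "maximal" item sets mean. It is not the index-based GenMax described here, and it uses neither progressive focusing nor differential set propagation. The function names, the toy transactions, and the `min_support` value are all assumptions chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute force: map every item set with support >= min_support to its support."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            # Support = number of transactions containing the candidate.
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                result[candidate] = support
                found = True
        if not found:
            break  # no frequent k-item set means no frequent (k+1)-item set
    return result


def closed_itemsets(frequent):
    """Closed: no proper superset has the same support."""
    return {s for s, sup in frequent.items()
            if not any(s < t and frequent[t] == sup for t in frequent)}


def maximal_itemsets(frequent):
    """Maximal: no proper superset is frequent at all."""
    return {s for s in frequent if not any(s < t for t in frequent)}


# Hypothetical toy database of five transactions.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a"}, {"c"})]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)

print(len(frequent), len(closed), len(maximal))
```

On this toy data every maximal item set is also closed, and the closed item sets are a subset of the frequent ones, which is why enumerating closed or maximal sets can be much cheaper than enumerating all frequent sets.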

Keywords

Index Mining, Frequent Item Set, Genmax, Association Rules, Data Mining, Transactional Databases

This work is licensed under a Creative Commons Attribution 3.0 License.
