A Novel Efficient Data Structure to Mine Frequent Itemset

Dr. E. Ramaraj, K. Ramesh kumar

Abstract


Association rule mining extracts interesting correlations and relationships from large databases. The process divides into two subproblems: first, finding the frequent itemsets in the transactions; second, constructing rules from the mined frequent itemsets. Frequent itemset generation is the prerequisite and the most time-consuming step of association rule mining. Apriori is the best-known and most fundamental algorithm for generating frequent itemsets from transaction sets. Many researchers have modified Apriori in various ways, for example with partitioning approaches and hash functions. However, most efficient Apriori-like algorithms rely heavily on the minimum support constraint to prune the vast number of non-candidate itemsets, and they still store many unwanted itemsets and transactions. This paper proposes a novel frequent itemset generation algorithm that overcomes the drawbacks of HEA, AprioriTid, and Apriori. The proposed algorithm is an improved version of the High Efficient AprioriTid (HEA) algorithm and uses two theorems introduced in this paper. It was tested on a synthetic retail dataset and performed well at low supports. The experimental results also show that the proposed algorithm is, overall, faster than HEA, AprioriTid, and Apriori.
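The two subproblems described in the abstract can be illustrated with a minimal sketch. This is a textbook Apriori baseline with a simple transaction reduction step, not the algorithm proposed in the paper: the abstract does not state the two theorems, so nothing here should be read as the authors' method. The function names `apriori` and `rules`, and the use of an absolute support count as the threshold, are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    min_support is an absolute count (an assumption of this sketch).
    """
    active = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in active:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 1
    while frequent:
        k += 1
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        # Prune step: keep a candidate only if every (k-1)-subset is frequent.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent
                for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count step with transaction reduction: a transaction that contains
        # no candidate k-itemset cannot support any larger itemset, so it is
        # dropped from later passes.
        counts = {c: 0 for c in candidates}
        kept = []
        for t in active:
            hit = False
            for c in candidates:
                if c <= t:
                    counts[c] += 1
                    hit = True
            if hit:
                kept.append(t)
        active = kept

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
    return result


def rules(frequent, min_confidence):
    """Build (antecedent, consequent, confidence) rules from frequent itemsets."""
    out = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so frequent[lhs] is always present.
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    out.append((lhs, itemset - lhs, confidence))
    return out
```

Calling `apriori(transactions, min_support)` on a list of item lists gives the support counts for the first subproblem, and passing that result to `rules(frequent, min_confidence)` covers the second. Each pass rescans the remaining transactions for every candidate, which is the cost that variants such as AprioriTid and HEA aim to reduce.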


Keywords


Data Mining, Association rule mining, Frequent itemsets, Transaction Reduction


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.