A Comparative Study on Frequent Item Set Generation Algorithms

M. Nirmala; Dr. V. Palanisamy

Abstract

Discovering frequent item sets and association rules is among the most significant tasks in data mining. Numerous efficient algorithms for mining frequent item sets and association rules are available in the literature. The time required to generate frequent item sets plays an important role, and some algorithms are designed with only the time factor in mind. Incorporating utility considerations into data mining tasks has been gaining popularity in recent years. Our study includes an in-depth analysis of the algorithms and discusses some problems in generating frequent item sets with them. The execution time for each data set is also analyzed. The work yields a detailed analysis of the algorithms that elucidates their performance on standard datasets such as Adult and Mushroom. The comparative study covers aspects such as different support values, transaction sizes, and different datasets.
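
To illustrate what generating frequent item sets under a minimum support value involves, the following is a minimal level-wise (Apriori-style) sketch in Python. It is not one of the implementations compared in this study; the function name `frequent_itemsets`, the fractional `min_support` parameter, and the toy transactions are assumptions made purely for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Illustrative level-wise (Apriori-style) search; support is the fraction
    of transactions that contain the itemset.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # Level 1: frequent single items.
    current = keep_frequent({frozenset([i]) for t in transactions for i in t})
    result = {}
    k = 1
    while current:
        result.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = keep_frequent(candidates)
    return result


# Hypothetical toy data: five transactions over the items a, b, c.
toy_transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
toy_result = frequent_itemsets(toy_transactions, min_support=0.5)
for itemset, s in sorted(toy_result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

Lowering `min_support` admits more candidates at every level, which is one reason execution time is sensitive to the support value chosen.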

Keywords

Data Mining, FP Growth, Frequent Item Set Mining, Mushroom

This work is licensed under a Creative Commons Attribution 3.0 License.
