Profit and Quantity Oriented Two Efficient Approaches for Utility Pattern Mining

Parvinder S. Sandhu, Dalvinder Singh Dhaliwal, S.N. Panda

Abstract


Traditional methods of association rule mining treat the appearance of an item in a transaction as a binary variable: the item is either purchased or not. In practice, a customer may purchase more than one unit of an item, and the unit cost may not be the same for all items. Utility mining, a generalized form of the share mining model, was introduced to overcome this limitation. Developing an efficient algorithm is vital for utility mining because high utility itemsets cannot be identified with the support-based pruning strategy used in frequent itemset mining. In this paper, we present two efficient approaches for utility pattern mining with the aid of the FP-growth algorithm. The efficiency of utility pattern mining is achieved with two major concepts: 1) Incorporating the utility values after mining the frequent patterns (IUA-FP). Here, the patterns mined by the FP-growth algorithm are used to generate high utility patterns from their internal and external utility. 2) Incorporating the utility values before mining the frequent patterns (IUB-FP). Here, individual items that are less significant are removed from the input database by considering their frequency along with their internal and external utility. The FP-growth algorithm is then applied to the transformed database to mine high utility patterns. Both concepts are evaluated on a synthetic dataset, T10I4D100K, obtained from the IBM dataset generator, and the performance study shows that the two proposed approaches are efficient in mining high utility patterns.
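The "utility after mining" idea (IUA-FP) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that internal utility is the purchased quantity and external utility is a per-item unit profit, and it replaces FP-growth with a brute-force enumeration that is only practical for toy data. The transactions, profit table, thresholds, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity
# (the quantity plays the role of internal utility).
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 1, "b": 2, "c": 1},
    {"b": 4},
]
# Hypothetical unit profit per item (the role of external utility).
unit_profit = {"a": 5, "b": 2, "c": 1}


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if all(i in t for i in itemset))


def itemset_utility(itemset, transactions, unit_profit):
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * unit_profit[i] for i in itemset)
    return total


def frequent_itemsets(transactions, min_support):
    """Brute-force stand-in for FP-growth (assumption: toy-sized input only)."""
    items = sorted({i for t in transactions for i in t})
    found = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            if support(combo, transactions) >= min_support:
                found.append(combo)
    return found


def high_utility_after_mining(transactions, unit_profit, min_support, min_utility):
    """Mine frequent patterns first, then keep those meeting the utility threshold."""
    result = {}
    for pattern in frequent_itemsets(transactions, min_support):
        u = itemset_utility(pattern, transactions, unit_profit)
        if u >= min_utility:
            result[pattern] = u
    return result


patterns = high_utility_after_mining(transactions, unit_profit,
                                     min_support=2, min_utility=15)
print(patterns)
```

On this toy data the itemset {a, b} has a higher utility than its subset {b}, which shows why a utility threshold cannot prune candidates the way a support threshold does: a low-utility itemset may still have a high-utility superset.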


Keywords


Data Mining, Association Rule Mining, FP-Growth Algorithm, Frequent Patterns, Utility, Transaction Utility

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.