Applying FP-Growth Algorithm to Reduce the Data Log Storage Size in Web Mining

R. Kousalya; S. Leo Philomin Raj

Abstract

Frequent patterns are central to knowledge discovery and web data mining tasks such as mining association rules and correlations. Many existing incremental mining algorithms are Apriori-based and are not easily adapted to association rule mining and frequent pattern discovery. An earlier approach to frequent pattern mining from web logs for web usage mining, called HFPA, is used. HFPA mines association rules from web logs using the standard Apriori algorithm, with a few adaptations that improve the interestingness of the rules produced and their applicability to web usage mining. We cluster the user sessions extracted from the web logs to partition the users into several homogeneous groups with similar activities, and then extract a user profile from each cluster as a set of relevant URLs. Applying data mining techniques to extract usage patterns from web log data is known as web usage mining. The implementation also concentrates on storage reduction: the proposed system implements the FP-Growth algorithm to reduce the data log storage size.
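
The abstract does not describe the implementation, so the following is only a minimal sketch of why an FP-tree, the structure FP-Growth builds, can take less space than the raw session log. It assumes each session is a list of URLs. The names `FPNode`, `build_fp_tree` and `count_nodes`, the sample sessions, and the support threshold are hypothetical and are not taken from the paper.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, a count, and child nodes."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(sessions, min_support):
    """Build an FP-tree from sessions (lists of URLs).

    Returns the root node and a dict of frequent URLs with their support.
    """
    # First pass: count in how many sessions each URL occurs.
    counts = Counter(url for session in sessions for url in set(session))
    frequent = {url: c for url, c in counts.items() if c >= min_support}

    root = FPNode(None)
    # Second pass: insert each session with infrequent URLs dropped and the
    # rest ordered by descending support (ties broken alphabetically), so
    # that sessions sharing frequent URLs share a path from the root.
    for session in sessions:
        items = sorted(
            (url for url in set(session) if url in frequent),
            key=lambda url: (-frequent[url], url),
        )
        node = root
        for url in items:
            child = node.children.get(url)
            if child is None:
                child = FPNode(url, parent=node)
                node.children[url] = child
            child.count += 1
            node = child
    return root, frequent


def count_nodes(node):
    """Number of item nodes below (and excluding) the given node."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical sessions, invented for illustration only.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/products", "/cart"],
    ["/home", "/about"],
    ["/home", "/products", "/contact"],
]

root, frequent = build_fp_tree(sessions, min_support=2)
stored_items = sum(len(s) for s in sessions)  # URL entries in the raw log
tree_nodes = count_nodes(root)                # nodes in the FP-tree
print(stored_items, tree_nodes)
```

In this toy input the five sessions hold 13 URL entries, while the tree needs only 3 nodes because the frequent URLs share a single prefix path. Part of that saving comes from discarding URLs below the support threshold, so the tree is a lossy summary of the log rather than a replacement for it.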

Keywords

Frequent Patterns, Apriori Algorithm, Association Rules, FP-Growth Algorithm

This work is licensed under a Creative Commons Attribution 3.0 License.
