A Fast Boosting based Incremental Genetic Algorithm for Mining Classification Rules in Large Datasets

P. Vivekanandan, R. Nedunchezhian

Abstract


A genetic algorithm (GA) is a search technique based on the process of natural evolution. It is widely used by the data mining community for classification rule discovery in complex domains. During learning, a GA makes several passes over the data set to determine the accuracy of candidate rules, which makes it an I/O-intensive and slow process. Applying a GA is particularly difficult when the training data set is too large and not fully available. This paper proposes an incremental genetic algorithm based on boosting that constructs an ensemble of weak classifiers in a fast, incremental manner and thereby aims to reduce the learning cost considerably.
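The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes an AdaBoost-style weighting scheme, single-threshold rules as weak classifiers, and a toy GA (truncation selection plus Gaussian mutation). All names (`evolve_rule`, `learn_chunk`, `demo`) and parameter values are hypothetical. Each chunk of data is read once, used to evolve a few rules, and then discarded, which is what keeps the number of passes over the full data set low.

```python
import math
import random

def predict_rule(rule, x):
    """A rule is (feature_index, threshold, polarity); it votes +1 or -1."""
    i, t, s = rule
    return s if x[i] > t else -s

def weighted_error(rule, chunk, w):
    """Sum of the weights of the examples the rule misclassifies."""
    return sum(wi for (x, y), wi in zip(chunk, w) if predict_rule(rule, x) != y)

def evolve_rule(chunk, w, rng, pop_size=20, generations=15):
    """Toy GA (assumed): keep the better half, refill with mutated copies."""
    n_feat = len(chunk[0][0])
    pop = [(rng.randrange(n_feat), rng.uniform(0, 1), rng.choice((-1, 1)))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda r: weighted_error(r, chunk, w))
        survivors = pop[:pop_size // 2]
        # Mutation perturbs only the threshold, clipped to [0, 1].
        children = [(i, min(1.0, max(0.0, t + rng.gauss(0, 0.1))), s)
                    for i, t, s in survivors]
        pop = survivors + children
    return min(pop, key=lambda r: weighted_error(r, chunk, w))

def learn_chunk(ensemble, chunk, rng, rounds=5):
    """Boost a few GA-evolved rules on one chunk and append them to the ensemble."""
    w = [1.0 / len(chunk)] * len(chunk)
    for _ in range(rounds):
        rule = evolve_rule(chunk, w, rng)
        err = min(max(weighted_error(rule, chunk, w), 1e-9), 1 - 1e-9)
        if err >= 0.5:  # no better than chance: stop boosting on this chunk
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, rule))
        # AdaBoost-style update: raise the weight of misclassified examples.
        w = [wi * math.exp(-alpha * y * predict_rule(rule, x))
             for (x, y), wi in zip(chunk, w)]
        total = sum(w)
        w = [wi / total for wi in w]

def predict(ensemble, x):
    """Weighted vote of all rules learned so far."""
    score = sum(a * predict_rule(r, x) for a, r in ensemble)
    return 1 if score >= 0 else -1

def demo(seed=0):
    """Train on four chunks that arrive one at a time; return held-out accuracy."""
    rng = random.Random(seed)

    def sample(n):
        pts = [(rng.random(), rng.random()) for _ in range(n)]
        return [(x, 1 if x[0] > 0.5 else -1) for x in pts]

    ensemble = []
    for _ in range(4):
        learn_chunk(ensemble, sample(50), rng)
    test = sample(200)
    return sum(predict(ensemble, x) == y for x, y in test) / len(test)
```

Calling `demo()` trains on synthetic two-feature data in which the label depends only on whether the first feature exceeds 0.5, then reports accuracy on fresh samples. The ensemble only ever grows, so earlier chunks never need to be revisited.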

Keywords


Classification, Incremental Learning, Genetic Algorithm (GA), Scalability, Boosting.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.