A Survey on Filter Approach for Feature Selection in Data Mining

V. Arul Kumar, Dr. L. Arockiam

Abstract


The rapid development of new technologies across many fields is generating ever larger volumes of data. As the data grow, managing and understanding them by hand becomes tedious. Machine learning can process large volumes of data; however, many applications involve high-dimensional feature spaces, which makes it difficult for learning algorithms to extract useful information. Many traditional learning algorithms fail as dimensionality increases, and the presence of irrelevant, redundant, and noisy features degrades their performance. Feature selection techniques were developed to address this high-dimensionality problem in machine learning. The goal of this survey is to provide a comprehensive review of feature selection algorithms that follow the filter approach.
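As a rough illustration of what a filter approach does, the sketch below scores each feature with a statistic computed from the data alone, without training any learning algorithm, and keeps the top-ranked features. It is not taken from any algorithm reviewed in the survey: the choice of absolute Pearson correlation as the score, the function names `pearson` and `filter_select`, and the toy data are all assumptions made for illustration. A univariate score like this can flag irrelevant features but does not detect redundancy between features.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)

def filter_select(columns, target, k):
    """Rank features by |correlation with target| and keep the top k.

    columns: dict mapping a feature name to its list of values.
    No learning algorithm is trained; scoring uses only the data.
    """
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical toy data: "signal" tracks the class label, "noise" does not,
# and "constant" never varies.
columns = {
    "signal": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "noise": [3.0, 1.0, 3.0, 1.0, 3.0, 1.0],
    "constant": [7.0] * 6,
}
target = [0, 0, 0, 1, 1, 1]

selected, scores = filter_select(columns, target, k=1)
print(selected)
```

Because the score never consults a classifier, the ranking is cheap to compute and can be reused with any learner; the trade-off is that it ignores interactions between features.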

Keywords


Data Mining, Machine Learning, Feature Selection, Feature Selection Algorithms, Filter Approach
