
Predicting Agricultural Crop Pests with Hadoop MapReduce based Decision Tree Algorithm

R. Revathy


Data mining explores large existing databases in order to generate new information. It is used to find relationships within bulky data sets, which is very helpful in decision making. In the agriculture sector, data mining can help farmers improve yield. Crops can be protected from vertebrate pests and diseases by predicting them and by enhancing crop cultivation through efficient data mining methods. The main aim of this research is to classify agricultural crop pests, which are categorized by different colors. The work comprises three phases: data preprocessing, feature selection, and execution of the C5.0 algorithm using MapReduce. Data preprocessing removes the noisy data from the crop pest data, which improves accuracy. In the feature selection phase, the Relief filter is applied to select attributes of the crop pest data set, so that a reduced attribute set is used instead of the full one. Relief selects instances and calculates attribute weights based on the distances between them. This work proposes a MapReduce implementation of the C5.0 decision tree algorithm that gives more accurate results quickly and uses less memory on the huge crop pest data set.
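The distance-based attribute weighting mentioned for the Relief filter can be illustrated with a minimal sketch. It is not the implementation used in this research: it assumes the basic two-class Relief scheme (nearest hit and nearest miss under Manhattan distance), numeric attributes already scaled to [0, 1], and at least two instances per class. The function name `relief_weights`, the toy rows, and the class labels are hypothetical.

```python
import random


def relief_weights(rows, labels, n_samples=None, seed=0):
    """Basic Relief sketch: return one weight per attribute.

    Assumes numeric attributes scaled to [0, 1], two classes,
    and at least two instances in each class.
    """
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    m = n_samples or n  # number of sampled instances
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance between two instances
        return sum(abs(x - y) for x, y in zip(a, b))

    for _ in range(m):
        i = rng.randrange(n)
        # Nearest neighbour with the same class (hit)
        hit = min((j for j in range(n) if j != i and labels[j] == labels[i]),
                  key=lambda j: dist(rows[i], rows[j]))
        # Nearest neighbour with a different class (miss)
        miss = min((j for j in range(n) if labels[j] != labels[i]),
                   key=lambda j: dist(rows[i], rows[j]))
        for f in range(d):
            # Reward attributes that differ on the miss, penalise those
            # that differ on the hit
            weights[f] += (abs(rows[i][f] - rows[miss][f])
                           - abs(rows[i][f] - rows[hit][f])) / m
    return weights


# Hypothetical toy data: attribute 0 separates the classes, attribute 1 is noise.
rows = [[0.0, 0.2], [0.1, 0.9], [0.05, 0.5],
        [0.9, 0.3], [1.0, 0.8], [0.95, 0.6]]
labels = ["green", "green", "green", "brown", "brown", "brown"]
weights = relief_weights(rows, labels)
```

In this toy setting the separating attribute receives a positive weight and the noise attribute a negative one, so a filter would keep the former and drop the latter.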


Data Mining, Data Preprocessing, Relief Filter, MapReduce Based C5.0 Decision Tree.





