Feature Selection Techniques with Distributed Data Mining Models

Dr. E. Chandra, P. Ajitha

Abstract


Data-mediated knowledge discovery is essential for end users who need to make value-added decisions. Discerning vital, accurate, and precise knowledge in classification requires suitable feature subsets. Apart from feature selection, the processing and representation of data are also indispensable for analysing and applying that knowledge. Principal Component Analysis (PCA) is used for data pre-processing and representation. Eigenvectors and covariance matrices are estimated for a distributed environment, where local and global sets are computed and evaluated. PCA reduces the dimensionality of the data. RELIEF, CMIM, and other feature selection methods are discussed in this paper. Selecting features in this way may increase classification accuracy and improve prediction.
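The idea of computing local statistics at each site and combining them into a global covariance matrix can be illustrated with a minimal sketch. It is not taken from the paper: the function names `local_summary` and `global_pca`, the three synthetic sites, and the choice of two components are all assumptions made for illustration. Each site shares only its row count, column sums, and cross-product matrix, and the coordinator recovers the pooled covariance from those.

```python
import numpy as np

def local_summary(X):
    """Per-site statistics (hypothetical helper): row count, column sums, X^T X."""
    X = np.asarray(X, dtype=float)
    return X.shape[0], X.sum(axis=0), X.T @ X

def global_pca(summaries, k):
    """Combine site summaries into the pooled covariance and return the top-k components."""
    n = sum(s[0] for s in summaries)          # total number of rows
    total = sum(s[1] for s in summaries)      # global column sums
    scatter = sum(s[2] for s in summaries)    # global X^T X
    mean = total / n
    # Sample covariance from aggregated statistics: (X^T X - n * mean mean^T) / (n - 1)
    cov = (scatter - n * np.outer(mean, mean)) / (n - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest eigenvalues
    return mean, eigvals[order], eigvecs[:, order]

# Synthetic data standing in for three sites that hold the same four features.
rng = np.random.default_rng(0)
sites = [rng.normal(size=(50, 4)), rng.normal(size=(80, 4)), rng.normal(size=(30, 4))]

mean, eigvals, components = global_pca([local_summary(X) for X in sites], k=2)

# Each site can project its own rows onto the shared global components.
reduced = [(X - mean) @ components for X in sites]
```

In this sketch no raw rows leave a site; only a few aggregates of fixed size are exchanged, which is one common motivation for distributed PCA. Subtracting `n * mean mean^T` from the aggregated cross-products can lose precision when feature means are large relative to their spread, so a production version would likely centre the data more carefully.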

Keywords


Feature Selection, Models, Distributed Data Mining, PCA, Classification, CMIM, mRMR, RELIEF.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.