Neighborhood Density based Clustering with Agglomerative Fuzzy K-Means Algorithm
Abstract
Clustering is one of the primary tools in unsupervised learning. It groups objects by their features so that objects in the same group are similar and objects in different groups are dissimilar. K-means is one of the most widely used clustering algorithms because of its simplicity and performance. However, the initial centroids for k-means are usually generated randomly, so results can vary from run to run. In this paper, we present a method for effectively selecting the initial cluster centers. The method identifies high-density neighborhoods (NSS) in the data and takes the centroids of these neighborhoods as the initial centers. The Agglomerative Fuzzy k-means (Ak-means) clustering algorithm is then used to merge these initial centers further, yielding the preferred number of clusters and better clustering results. The merging step also produces more consistent clustering results across different sets of initial cluster centers. Experiments on several data sets indicate that the proposed approach is effective at automatically identifying the true number of clusters and at producing correct clustering results.
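The two stages described in the abstract (density-based seeding, then merging of centers) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fuzzy memberships and the agglomerative penalty of Ak-means are replaced here by a plain distance-threshold merge followed by ordinary hard k-means updates. The function names and the parameters `radius`, `min_neighbors`, and `merge_dist` are assumptions introduced only for this illustration.

```python
import math
import random

def density_seeds(points, radius, min_neighbors):
    """Greedy seeding: visit points from densest to sparsest neighborhood and
    use the centroid of each not-yet-covered dense neighborhood as a seed."""
    neighbors = [
        [j for j, q in enumerate(points) if math.dist(p, q) <= radius]
        for p in points
    ]
    order = sorted(range(len(points)), key=lambda i: len(neighbors[i]), reverse=True)
    covered, seeds = set(), []
    for i in order:
        if i in covered or len(neighbors[i]) < min_neighbors:
            continue
        members = [points[j] for j in neighbors[i]]
        seeds.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        covered.update(neighbors[i])
    return seeds

def merge_close(centers, merge_dist):
    """Stand-in for the agglomerative step (assumption): repeatedly replace any
    two centers closer than merge_dist by their unweighted midpoint."""
    centers = [tuple(c) for c in centers]
    merged = True
    while merged:
        merged = False
        for i in range(len(centers)):
            for j in range(i + 1, len(centers)):
                if math.dist(centers[i], centers[j]) < merge_dist:
                    mid = tuple((a + b) / 2 for a, b in zip(centers[i], centers[j]))
                    centers = [c for k, c in enumerate(centers) if k not in (i, j)]
                    centers.append(mid)
                    merged = True
                    break
            if merged:
                break
    return centers

def kmeans(points, centers, iters=20):
    """Plain (hard) k-means refinement starting from the given centers."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[k]
            for k, g in enumerate(groups)
        ]
    return centers

# Synthetic data (assumption): two well-separated square blobs of 30 points each.
rng = random.Random(0)
true_centers = [(0.0, 0.0), (10.0, 10.0)]
points = [
    (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
    for cx, cy in true_centers
    for _ in range(30)
]

seeds = density_seeds(points, radius=1.5, min_neighbors=5)
centers = sorted(kmeans(points, merge_close(seeds, merge_dist=3.0)))
print(len(seeds), "seeds ->", len(centers), "centers:", centers)
```

In this sketch the number of clusters is never supplied by the caller. It falls out of the seeding and merging thresholds, which mirrors the abstract's goal of identifying the cluster number automatically, although the paper's own merging criterion differs.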
This work is licensed under a Creative Commons Attribution 3.0 License.