An Efficient Method for K-Means Clustering

Abstract: Existing K-Means clustering methods operate directly on multidimensional datasets, so they become extremely inefficient as the cardinality of the input data and the number of clustering attributes grow. Motivated by this, the paper proposes KMCRG (K-Means Clustering based on Regular Grid), an efficient K-Means approach built on a regular grid structure. The method performs K-Means clustering with grid cells, rather than individual data points, as the objects being processed. In particular, it uses a grid-weighted iteration strategy to obtain the final K clusters efficiently. Experimental results show that the algorithm obtains clustering results quickly without losing clustering precision.
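The abstract does not spell out the algorithm, so the following Python sketch only illustrates the general idea it describes: bin the points into a regular grid, summarize each non-empty cell by its centroid and point count, and run a weighted K-Means iteration over the cells instead of the raw points. It is not the authors' KMCRG implementation. The names `build_grid`, `weighted_kmeans`, and `cell_size`, the use of cell centroids, and the random initialization are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def build_grid(points, cell_size):
    """Bin points into a regular grid (assumed uniform cell_size per axis).

    Returns one (centroid, weight) pair per non-empty cell, where weight
    is the number of points that fell into that cell.
    """
    sums = {}
    counts = defaultdict(int)
    for p in points:
        key = tuple(math.floor(x / cell_size) for x in p)
        if key not in sums:
            sums[key] = [0.0] * len(p)
        for i, x in enumerate(p):
            sums[key][i] += x
        counts[key] += 1
    return [
        (tuple(s / counts[key] for s in sums[key]), counts[key])
        for key in sorted(sums)
    ]


def weighted_kmeans(cells, k, iterations=50, seed=0):
    """Lloyd-style K-Means over cells, weighting each cell by its count."""
    rng = random.Random(seed)
    # Assumption: initial centers are k randomly chosen cell centroids.
    centers = [centroid for centroid, _ in rng.sample(cells, k)]
    for _ in range(iterations):
        # Assignment step: each cell goes to its nearest center.
        groups = [[] for _ in range(k)]
        for centroid, weight in cells:
            j = min(range(k), key=lambda i: math.dist(centroid, centers[i]))
            groups[j].append((centroid, weight))
        # Update step: each center becomes the weighted mean of its cells.
        new_centers = []
        for j, group in enumerate(groups):
            if not group:
                new_centers.append(centers[j])  # keep an empty cluster's center
                continue
            total = sum(w for _, w in group)
            dim = len(group[0][0])
            new_centers.append(tuple(
                sum(c[d] * w for c, w in group) / total for d in range(dim)
            ))
        if new_centers == centers:
            break
        centers = new_centers
    return centers


if __name__ == "__main__":
    demo_points = [(0.5, 0.5), (0.6, 0.4), (1.2, 0.5), (10.5, 10.5), (11.5, 10.5)]
    demo_cells = build_grid(demo_points, cell_size=1.0)
    print(sorted(weighted_kmeans(demo_cells, k=2)))
```

Each iteration of this sketch costs time proportional to the number of non-empty cells rather than the number of points, which is where the speed-up in a grid-based scheme would come from. Because the update uses count-weighted cell centroids, each center equals the exact mean of the raw points in the cells assigned to it; any loss of precision comes only from points in the same cell being forced into the same cluster.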

Key words: K-Means Clustering; Regular Grid Structure; Performance Evaluation

Received: 28 July 2008

Chinese Library Classification number (ZTFLH): TP311.13

Authors: HUANG Zhen-Hua, XIANG Yang, ZHANG Bo, WANG Dong, LIU Xiao-Ling

Cite this article:

HUANG Zhen-Hua, XIANG Yang, ZHANG Bo, et al. An Efficient Method for K-Means Clustering[J]. , 2010, 23(4): 516-521.

URL:

http://manu46.magtech.com.cn/Jweb_prai/EN/ OR http://manu46.magtech.com.cn/Jweb_prai/EN/Y2010/V23/I4/516
