A Fuzzy Cluster Validity Index Based on Clustering Mistake Measures
BEN Sheng-Lan, SU Guang-Da
Department of Electronic Engineering, Tsinghua University, Beijing 100084
Abstract Mistakes in fuzzy clustering fall into two types: assigning data that originate from different classes to one cluster, and assigning data that originate from the same class to different clusters. In this paper, intra-class non-consistency and inter-class overlapping are defined to measure these two kinds of mistakes, respectively. A good fuzzy partition is expected to have few clustering mistakes and high compactness. Based on the two mistake measures and cluster compactness, a cluster validity index is proposed to evaluate clustering results. Experimental results show the effectiveness and robustness of the proposed validity index in determining the optimal number of clusters.
Received: 22 April 2009