Abstract: Mistakes in fuzzy clustering fall into two types: assigning data that originate from different classes to one cluster, and assigning data that originate from the same class to different clusters. In this paper, intra-class non-consistency and inter-class overlapping are defined to measure these two kinds of mistakes, respectively. A good fuzzy partition is expected to have few clustering mistakes and high compactness. Based on the two mistake measures and cluster compactness, a cluster validity index is proposed to evaluate clustering results. Experimental results show the effectiveness and robustness of the proposed validity index in determining the optimal number of clusters.
BEN Sheng-Lan, SU Guang-Da. A Fuzzy Cluster Validity Index Based on Clustering Mistake Measures. Pattern Recognition and Artificial Intelligence, 2010, 23(1): 11-16 (in Chinese) (贲圣兰, 苏光大. 基于错误度量的模糊聚类有效性函数. 模式识别与人工智能, 2010, 23(1): 11-16)