Interval Type-2 Fuzzy Measure Based Rough K-means Clustering
LU Ruiqiang1, MA Fumin1, ZHANG Tengfei2
1. College of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023
2. College of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023
The rough K-means algorithm and its derivatives focus on describing data objects in uncertain boundary regions, but they ignore the influence of imbalanced cluster sizes on the clustering result. This paper introduces an interval type-2 fuzzy measure for boundary objects and develops an improved rough K-means clustering algorithm. First, the membership degree interval of each boundary object is calculated from the data distribution of the clusters, which describes the spatial distribution of the clusters. Then, the sample size of each cluster is taken into account to adaptively adjust the influence coefficient of boundary objects on the overlapping clusters. Experimental results on synthetic and UCI datasets show that the adverse impact of boundary objects on the iterative mean calculation of small clusters is mitigated and that clustering accuracy is improved.
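The two steps summarized above can be illustrated with a minimal Python sketch. It is not the algorithm of this paper, whose formulas are not given in the abstract. It assumes a distance-ratio threshold (`eps`, hypothetical) to separate lower-approximation objects from boundary objects, as is common in rough K-means variants, and it borrows the two-fuzzifier construction (`m1`, `m2`, values assumed) from interval type-2 fuzzy C-means to obtain a membership interval. The size-aware rule in `boundary_weights` is purely a hypothetical illustration of how cluster size could scale the influence of a boundary object.

```python
import math

def fcm_membership(dists, m):
    """Fuzzy C-means membership of one object to each cluster, fuzzifier m > 1."""
    zero = [j for j, d in enumerate(dists) if d == 0.0]
    if zero:  # object coincides with a centroid: share full membership among them
        return [1.0 / len(zero) if j in zero else 0.0 for j in range(len(dists))]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exp for dk in dists) for dj in dists]

def membership_interval(dists, m1=1.5, m2=3.0):
    """Per-cluster (lower, upper) membership from two fuzzifiers (assumed values)."""
    u1 = fcm_membership(dists, m1)
    u2 = fcm_membership(dists, m2)
    return [(min(a, b), max(a, b)) for a, b in zip(u1, u2)]

def rough_partition(points, centroids, eps=1.3):
    """Split objects into lower approximations and boundary regions.

    An object whose distance to several centroids is within eps times its
    smallest distance is placed in the boundary of all those clusters
    (assumed ratio rule); otherwise it goes to one lower approximation.
    """
    k = len(centroids)
    lower = [[] for _ in range(k)]
    boundary = [[] for _ in range(k)]
    for i, p in enumerate(points):
        dists = [math.dist(p, c) for c in centroids]
        dmin = min(dists)
        near = [j for j, d in enumerate(dists) if d <= eps * dmin]
        if len(near) == 1:
            lower[near[0]].append(i)
        else:
            for j in near:
                boundary[j].append(i)
    return lower, boundary

def boundary_weights(interval, sizes, overlap):
    """Hypothetical size-aware influence of one boundary object.

    Each overlapping cluster j gets a weight proportional to the midpoint of
    its membership interval times its lower-approximation size, so a small
    cluster's mean is pulled less by shared boundary objects.
    """
    raw = {j: 0.5 * (interval[j][0] + interval[j][1]) * sizes[j] for j in overlap}
    total = sum(raw.values())
    if total == 0.0:  # all overlapping clusters are empty: fall back to equal shares
        return {j: 1.0 / len(overlap) for j in overlap}
    return {j: raw[j] / total for j in overlap}
```

In a full iteration, each cluster mean would be recomputed from its lower-approximation objects plus its boundary objects weighted by values such as those returned by `boundary_weights`; how the paper actually combines the interval and the cluster size may differ from this sketch.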