Abstract: The performance of soft subspace clustering depends on the objective function and the subspace search strategy, and cluster validity analysis is the main indicator of that performance. To improve subspace clustering performance, a soft subspace clustering algorithm based on particle swarm optimization (SC-PSO) is proposed. First, a fuzzy weighting soft subspace objective function is designed within the K-means-type clustering framework by combining inter-cluster separation with feature weights. Then, particle swarm optimization with inertia weight is used as the subspace search strategy to escape local optima. Finally, the optimal number of clusters is selected by the proposed cluster validity function. Experimental results demonstrate that SC-PSO improves clustering accuracy and automatically determines the optimal number of clusters.
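As an illustration of the search strategy named above, the following is a minimal sketch of particle swarm optimization with a fixed inertia weight. It is not the SC-PSO implementation: the abstract does not give the objective function or the parameter settings, so the toy `sphere` objective, the function name `pso_minimize`, and the values `w=0.7`, `c1=1.5`, `c2=1.5` are all assumptions chosen for demonstration only.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Generic PSO with inertia weight w (hypothetical helper, not SC-PSO)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + cognitive term + social term.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp the coordinate to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective (assumption): sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


best, best_val = pso_minimize(sphere, dim=3)
print(best, best_val)
```

In a soft subspace clustering setting, the particle position would presumably encode the feature weights and `objective` would be the clustering objective; that mapping is an assumption here, since the abstract does not specify the encoding.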