Abstract:For a K clustering problem, Ng-Jordan-Weiss (NJW) spectral clustering method adopts the eigenvectors corresponding to the K largest eigenvalues of the normalized affinity matrix derived from a dataset as a novel representation of the original data. However, these K eigenvectors can not always reflect the structure of the original data for some pattern recognition problems. In this paper, a semi-supervised eigenvector selection method for spectral clustering is proposed. This method utilizes some amount of supervised information to search the eigenvector combination which can reflect the structure of the original data, and then obtains more satisfying performance than the classical spectral clustering algorithms. Experimental results on UCI benchmark datasets and MNIST handwritten digits datasets show that the proposed method is effective and robust.
[1] Fiedler M. Algebraic Connectivity of Graphs. Czechoslovak Mathematical Journal, 1973, 23(98): 298-305 [2] Hendrickson B, Leland R. An Improved Spectral Graph Partitioning Algorithm for Mapping Parallel Computations. SIAM Journal on Scientific Computing, 1995, 16(2): 452-469 [3] Hagen L, Kahng A B. New Spectral Methods for Ratio Cut Partitioning and Clustering. IEEE Trans on Computer-Aided Design, 1992, 11(9): 1074-1085 [4] Dhillon I S. Co-Clustering Documents and Words Using Bipartite Spectral Graph Partitioning // Proc of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). San Francisco, USA, 2001: 269-274 [5] Xu Sen, Lu Zhimao, Gu Guochang. Document Cluster Ensemble Algorithms Based on Matrix Spectral Analysis. Pattern Recognition and Artificial Intelligence, 2009, 22(5): 780-786 (in Chinese) (徐 森,卢志茂,顾国昌.基于矩阵谱分析的文本聚类集成算法.模式识别与人工智能, 2009, 22(5): 780-786) [6] Ding C, He Xiaofeng, Zha Hongyuan, et al. Unsupervised Learning: Self-Aggregation in Scaled Principal Component Space // Proc of the 6th European Conference on Principles of Data Mining and Knowledge Discovery. Helsinki, Finland, 2002: 112-124 [7] Shi Jiaobo, Malik J. Normalized Cuts and Image Segmentation. IEEE Trans on Pattern Analysis and Machine Intelligence, 2000, 22(8): 888-905 [8] Ng A Y, Jordan M I, Weiss Y. On Spectral Clustering: Analysis and an Algorithm // Dietterich T, Becker S, Ghahramani Z, eds. Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2002, XIV: 849-856 [9] Fowlkes C, Belongie S, Chung F, et al. Spectral Grouping Using the Nystrm Method. IEEE Trans on Pattern Analysis and Machine Intelligence, 2004, 26(2): 214-225 [10] Wang Ling, Bo Liefeng, Jiao Licheng. Density-Sensitive Semi-Supervised Spectral Clustering. Journal of Software, 2007, 18(10): 2412-2422 (in Chinese) (王 玲,薄列峰,焦李成.密度敏感的半监督谱聚类.软件学报, 2007, 18(10): 2412-2422) [11] Sun Z, Bebis G, Miller R. Object Detection Using Feature Subset Selection. Pattern Recognition, 2004, 37(11): 2165-2176 [12] Li Guozheng, Bu Hualong, Yang M Q, et al. Selecting Subsets of Newly Extracted Features from PCA and PLS in Microarray Data Analysis. BMC Genomics, 2008, 9(Z2): 24 [13] Xiang Tuo, Gong Shaogang. Spectral Clustering with Eigenvector Selection. Pattern Recognition, 2008, 41(3): 1012-1029 [14] Jiao Licheng, Du Haifeng, Liu Fang, et al. Immunological Computation for Optimization, Learning and Recognition. Beijing, China: Science Press, 2006 (in Chinese) (焦李成,杜海峰,刘 芳,等.免疫优化计算学习与识别.北京:科学出版社, 2006) [15] Zelnik-Manor L, Perona P. Self-Tuning Spectral Clustering // Saul L K, Weiss Y, Bottou L, eds. Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2004, XVII: 1601-1608 [16] Zhang Bin, Hsu M, Dayal U. K-Harmonic Means-A Spatial Clustering Algorithm with Boosting // Proc of the 1st International Workshop on Temporal, Spatial and Spatio-Temporal Data Mining. Lyon, France, 2000: 31-45 [17] Asuncion A, Newman D J. UCI Machine Learning Repository [DB/OL]. [2009-11-01]. http://www.ics.uci.edu/~mlearn/MLRepository.html [18] Lecun Y Bottou L, Bengio Y, et al. Gradient-Based Learning Applied to Document Recognition. Proc of the IEEE, 1998, 86(11): 2278-2324 [19] Wu Mingru, Schlkopf B. A Local Learning Approach for Clustering // Proc of the 20th Annual Conference on Neural Information Processing Systems. Vancouver, Canada, 2007: 1529-1536 [20] Burges C J C. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 1998, 2(2): 121-167 [21] Geng Xin, Zhan Dechuan, Zhou Zhihua. Supervised Nonlinear Dimensionality Reduction for Visualization and Classification. IEEE Trans on Systems, Man and Cybernetics, 2005, 35(6): 1098-1107