Most Nyström methods yield unstable clustering results and weakly representative samples when applied to spectral clustering. To address these problems, a spectral clustering algorithm based on weighted ensemble Nyström sampling is proposed. First, statistical leverage scores are used to measure the importance of each data point, and the data are weighted accordingly. Then, weighted K-means center sampling based on these weights is used to obtain multiple sets of sampling points. An ensemble framework is introduced, and the approximate kernel matrices are constructed by running the Nyström method in parallel on a cluster. Finally, ridge regression is used to determine the weights of the approximate kernel matrices, and the matrices are combined to produce a more accurate low-rank approximation than the standard Nyström method. Experiments on UCI datasets demonstrate that the proposed algorithm achieves better clustering results.
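The pipeline described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the Gaussian kernel, all function names, the parameter defaults, and the choice to fit the ridge weights on a random subset of kernel columns are assumptions made for illustration. The full kernel matrix is built here only to keep the demonstration short; a large-scale setting would approximate the leverage scores instead. The final spectral clustering step (eigendecomposition of the combined matrix followed by K-means) is omitted.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def leverage_scores(K, rank):
    """Statistical leverage scores: squared row norms of the top eigenvectors."""
    _, vecs = np.linalg.eigh(K)              # eigenvalues come in ascending order
    top = vecs[:, -rank:]
    return (top ** 2).sum(axis=1)

def weighted_kmeans_centers(X, w, m, rng, iters=20):
    """Lloyd iterations where each center is the weighted mean of its cluster."""
    start = rng.choice(len(X), size=m, replace=False, p=w / w.sum())
    centers = X[start].copy()
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(m):
            members = labels == j
            if members.any():                # an empty cluster keeps its old center
                centers[j] = np.average(X[members], axis=0, weights=w[members])
    return centers

def nystrom(X, centers, gamma):
    """Standard Nystrom approximation C W^+ C^T with the centers as landmarks."""
    C = rbf_kernel(X, centers, gamma)
    W = rbf_kernel(centers, centers, gamma)
    return C @ np.linalg.pinv(W) @ C.T

def ridge_mixture_weights(approx, K_cols, cols, lam):
    """Ridge regression of the true kernel columns on each expert's columns."""
    A = np.stack([Kr[:, cols].ravel() for Kr in approx], axis=1)
    b = K_cols.ravel()
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)

def weighted_ensemble_nystrom(X, m=8, experts=4, rank=5, gamma=0.5,
                              lam=1e-3, n_val=20, seed=0):
    rng = np.random.default_rng(seed)
    K = rbf_kernel(X, X, gamma)              # full kernel: demonstration only
    w = leverage_scores(K, rank) + 1e-12     # keep every weight strictly positive
    # Each expert uses its own set of weighted K-means centers (sequential here;
    # the experts are independent and could run in parallel).
    approx = [nystrom(X, weighted_kmeans_centers(X, w, m, rng), gamma)
              for _ in range(experts)]
    cols = rng.choice(len(X), size=n_val, replace=False)
    mu = ridge_mixture_weights(approx, K[:, cols], cols, lam)
    K_ens = sum(c * Kr for c, Kr in zip(mu, approx))
    return K_ens, approx, mu, cols, K

X = np.random.default_rng(1).normal(size=(60, 3))
K_ens, approx, mu, cols, K = weighted_ensemble_nystrom(X)
print("mixture weights:", mu)
print("ensemble error:", np.linalg.norm(K_ens - K))
```

Because the single-expert solutions are feasible points of the ridge objective, the combined matrix fits the columns used for the regression at least as well as the best individual expert, up to the regularization constant `lam`.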