LIU Bo1,2, XIE Bojun1,3, ZHU Jie1,4, JING Liping1, YU Jian1
1.Beijing Key Laboratory of Traffic Data Analysis and Mining, School of Computer and Information Technology,Beijing Jiaotong University, Beijing 100044 2.College of Information Science and Technology, Agricultural University of Hebei, Baoding 071001 3.Hebei Province Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding 071002 4.Department of Information Management, The National Police University for Criminal Justice, Baoding 071000
Abstract:Most existing subspace clustering methods are inefficient for large scale datasets and are unable to handle out-of-sample data. To address these problems, a framework is proposed called two-stage sample selection for subspace clustering (TSSC). TSSC consists of two key components: discriminative collaborative representation (DCR) and multi-scale K nearest neighbors (KNN). DCR is combined with multi-scale KNN for feature mapping, and thus the samples belonging to the same subspace have more consistent representation. To enhance the scalability of the algorithm, multi-scale KNN is reused to select some information points from the new feature space by TSSC. Then, a linear classifier is trained according to the clustering result produced by the pre-selected points. Finally, the rest samples are categorized to obtain the final clustering result. Experiments conducted on the real-world datasets verify the effectiveness of TSSC.
[1] BASRI R, JACOBS D. Lambertian Reflectance and Linear Subspaces. IEEE Trans on Pattern Analysis and Machine Intelligence, 2003, 25(2): 218-233. [2] COSTEIRA J P, KANADE T. A Multibody Factorization Method for Independently Moving Objects. International Journal of Computer Vision, 1998, 29(3): 159-179. [3] ROWEIS S T, SAUL L K. Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science, 2000, 290(5500): 2323-2326. [4] WRIGHT J, YANG A Y, GANESH A, et al. Robust Face Recognition via Sparse Representation. IEEE Trans on Pattern Analysis and Machine Intelligence, 2009, 31(2): 210-227. [5] KANATANI K. Motion Segmentation by Subspace Separation and Model Selection // Proc of the 8th IEEE International Conference on Computer Vision. Vancouver, Canada, 2001, II: 586-591. [6] VIDAL R. Subspace Clustering. IEEE Signal Processing Magazine, 2011, 28(2): 52-68. [7] BOULT T E, BROWN L G. Factorization-Based Segmentation of Motions // Proc of the IEEE Workshop on Visual Motion. Princeton, USA, 1991: 179-186. [8] TSENG P. Nearest q-Flat to m Points. Journal of Optimization Theory and Applications, 2000, 105(1): 249-252. [9] SUGAYA Y, KANATANI K. Geometric Structure of Degeneracy for Multi-body Motion Segmentation // Proc of the ECCV Workshop on Statistical Methods in Video Processing. Prague, Czech Republic, 2004: 13-25. [10] ELHAMIFAR E, VIDAL R. Sparse Subspace Clustering: Algorithm, Theory, and Applications. IEEE Trans on Pattern Analysis and Machine Intelligence, 2013, 35(11): 2765-2781. [11] LIU G C, LIN Z C, YAN S C, et al. Robust Recovery of Subspace Structures by Low-Rank Representation. IEEE Trans on Pattern Analysis and Machine Intelligence, 2013, 35(1): 171-184. [12] LU C Y, FENG J S, LIN Z C, et al. Correlation Adaptive Subspace Segmentation by Trace Lasso // Proc of the IEEE International Conference on Computer Vision. Sydney, Australia, 2013: 1345-1352. [13] PHAM D S, BUDHADITYA S, PHUNG D, et al. Improved Subspace Clustering via Exploitation of Spatial Constraints // Proc of the IEEE International Conference on Computer Vision and Pattern Recognition. Providence, USA, 2012: 550-557. [14] LU C Y, MIN H, ZHAO Z Q, et al. Robust and Efficient Subspace Segmentation via Least Squares Regression // Proc of the 12th European Conference on Computer Vision. Florence, Italy, 2012, VII: 347-360. [15] SAHA B, PHAM D S, PHUNG D, et al. Sparse Subspace Clus-tering via Group Sparse Coding // Proc of the 13th SIAM International Conference on Data Mining. Austin, USA, 2013: 130-138. [16] HU H, LIN Z C, FENG J J, et al. Smooth Representation Clus-tering // Proc of the IEEE Conference on Computer Vision and Pa-ttern Recognition. Columbus, USA, 2014: 3834-3841. [17] VIDAL R, FAVARO P. Low Rank Subspace Clustering (LRSC). Pattern Recognition Letters, 2014, 43: 47-61. [18] PENG X, Tang H J, Zhang L, et al. A Unified Framework for Representation-Based Subspace Clustering of Out-of-Sample and Large-Scale Data. IEEE Trans on Neural Networks and Learning Systems, 2015. DOI: 10.1109/TNNLS.2015.2490080. [19] TALWALKAR A, MACKEY L, MU Y D, et al. Distributed Low-Rank Subspace Segmentation // Proc of the IEEE International Conference on Computer Vision. Sydney, Australia, 2013: 3543-3550. [20] CHEN X L, CAI D. Large Scale Spectral Clustering with Landmark-Based Representation // Proc of the 25th AAAI Conference on Artificial Intelligence. San Francisco, USA, 2011: 313-318. [21] NEI F P, ZENG Z N, TSANG I W, et al. Spectral Embedded Clustering: A Framework for In-sample and Out-of-sample Spectral Clustering. IEEE Trans on Neural Networks, 2011, 22(11): 1796-1808. [22] CAI D, HE X F, HAN J W, et al. Graph Regularized Non-negative Matrix Factorization for Data Representation. IEEE Trans on Pattern Analysis and Machine Intelligence, 2010, 33(8): 1548-1560. [23] ZHANG L, YANG M, FENG X C. Sparse Representation or Co-llaborative Representation: Which Helps Face Recognition // Proc of the IEEE International Conference on Computer Vision. Barcelona, Spain, 2011: 471-478. [24] SETTLES B. Active Learning Literature Survey. Technical Report, 1648. Madison, USA: University of Wisconsin, 2010. [25] NG A Y, JORDAN M I, WEISS Y. On Spectral Clustering: Ana-lysis and an Algorithm // DIETTERICH T G, BECKER S, GHAHRAMANI Z, eds. Advances in Neural Information Proce-ssing Systems 14. Cambridge, USA: MIT Press, 2002: 849-856.