|
|
Fast Scalable Subspace Clustering Algorithm |
LIU Bo1,2, XIE Bojun1,3, ZHU Jie1,4, JING Liping1, YU Jian1 |
1.Beijing Key Laboratory of Traffic Data Analysis and Mining, School of Computer and Information Technology,Beijing Jiaotong University, Beijing 100044 2.College of Information Science and Technology, Agricultural University of Hebei, Baoding 071001 3.Hebei Province Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding 071002 4.Department of Information Management, The National Police University for Criminal Justice, Baoding 071000 |
|
|
Abstract Most existing subspace clustering methods are inefficient for large scale datasets and are unable to handle out-of-sample data. To address these problems, a framework is proposed called two-stage sample selection for subspace clustering (TSSC). TSSC consists of two key components: discriminative collaborative representation (DCR) and multi-scale K nearest neighbors (KNN). DCR is combined with multi-scale KNN for feature mapping, and thus the samples belonging to the same subspace have more consistent representation. To enhance the scalability of the algorithm, multi-scale KNN is reused to select some information points from the new feature space by TSSC. Then, a linear classifier is trained according to the clustering result produced by the pre-selected points. Finally, the rest samples are categorized to obtain the final clustering result. Experiments conducted on the real-world datasets verify the effectiveness of TSSC.
|
Received: 13 May 2015
|
|
Fund:Supported by National Natural Science Foundation of China(No.61370129,61375062), Specializel Research Fund for the Doctoral Program of Higher Education of China(No.20120009110006), Program for Changjiang Scholars and Innovative Research Team in University(No.IRT201206), the Fundamental Research Funds for the Central Universities(No.2014JBM029) |
About author:: (LIU Bo, born in 1981, Ph.D. candidate. His research inte-rests include subspace clustering, semi-supervised learning and face recognition.)(XIE Bojun, born in 1981, Ph.D. candidate. His research interests include machine learning and computer vision.)(ZHU Jie, born in 1982, Ph.D. candidate. His research interests include machine learning, object identification and image classification.)(JING Liping (Corresponding author), born in 1978, Ph.D., professor. Her research interests include data mining, text mi-ning, bioinformatics and enterprise intelligence.)(YU Jian, born in 1969, Ph.D., professor. His research interests include clustering analysis and image processing.) |
|
|
|
[1] BASRI R, JACOBS D. Lambertian Reflectance and Linear Subspaces. IEEE Trans on Pattern Analysis and Machine Intelligence, 2003, 25(2): 218-233. [2] COSTEIRA J P, KANADE T. A Multibody Factorization Method for Independently Moving Objects. International Journal of Computer Vision, 1998, 29(3): 159-179. [3] ROWEIS S T, SAUL L K. Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science, 2000, 290(5500): 2323-2326. [4] WRIGHT J, YANG A Y, GANESH A, et al. Robust Face Recognition via Sparse Representation. IEEE Trans on Pattern Analysis and Machine Intelligence, 2009, 31(2): 210-227. [5] KANATANI K. Motion Segmentation by Subspace Separation and Model Selection // Proc of the 8th IEEE International Conference on Computer Vision. Vancouver, Canada, 2001, II: 586-591. [6] VIDAL R. Subspace Clustering. IEEE Signal Processing Magazine, 2011, 28(2): 52-68. [7] BOULT T E, BROWN L G. Factorization-Based Segmentation of Motions // Proc of the IEEE Workshop on Visual Motion. Princeton, USA, 1991: 179-186. [8] TSENG P. Nearest q-Flat to m Points. Journal of Optimization Theory and Applications, 2000, 105(1): 249-252. [9] SUGAYA Y, KANATANI K. Geometric Structure of Degeneracy for Multi-body Motion Segmentation // Proc of the ECCV Workshop on Statistical Methods in Video Processing. Prague, Czech Republic, 2004: 13-25. [10] ELHAMIFAR E, VIDAL R. Sparse Subspace Clustering: Algorithm, Theory, and Applications. IEEE Trans on Pattern Analysis and Machine Intelligence, 2013, 35(11): 2765-2781. [11] LIU G C, LIN Z C, YAN S C, et al. Robust Recovery of Subspace Structures by Low-Rank Representation. IEEE Trans on Pattern Analysis and Machine Intelligence, 2013, 35(1): 171-184. [12] LU C Y, FENG J S, LIN Z C, et al. Correlation Adaptive Subspace Segmentation by Trace Lasso // Proc of the IEEE International Conference on Computer Vision. Sydney, Australia, 2013: 1345-1352. [13] PHAM D S, BUDHADITYA S, PHUNG D, et al. Improved Subspace Clustering via Exploitation of Spatial Constraints // Proc of the IEEE International Conference on Computer Vision and Pattern Recognition. Providence, USA, 2012: 550-557. [14] LU C Y, MIN H, ZHAO Z Q, et al. Robust and Efficient Subspace Segmentation via Least Squares Regression // Proc of the 12th European Conference on Computer Vision. Florence, Italy, 2012, VII: 347-360. [15] SAHA B, PHAM D S, PHUNG D, et al. Sparse Subspace Clus-tering via Group Sparse Coding // Proc of the 13th SIAM International Conference on Data Mining. Austin, USA, 2013: 130-138. [16] HU H, LIN Z C, FENG J J, et al. Smooth Representation Clus-tering // Proc of the IEEE Conference on Computer Vision and Pa-ttern Recognition. Columbus, USA, 2014: 3834-3841. [17] VIDAL R, FAVARO P. Low Rank Subspace Clustering (LRSC). Pattern Recognition Letters, 2014, 43: 47-61. [18] PENG X, Tang H J, Zhang L, et al. A Unified Framework for Representation-Based Subspace Clustering of Out-of-Sample and Large-Scale Data. IEEE Trans on Neural Networks and Learning Systems, 2015. DOI: 10.1109/TNNLS.2015.2490080. [19] TALWALKAR A, MACKEY L, MU Y D, et al. Distributed Low-Rank Subspace Segmentation // Proc of the IEEE International Conference on Computer Vision. Sydney, Australia, 2013: 3543-3550. [20] CHEN X L, CAI D. Large Scale Spectral Clustering with Landmark-Based Representation // Proc of the 25th AAAI Conference on Artificial Intelligence. San Francisco, USA, 2011: 313-318. [21] NEI F P, ZENG Z N, TSANG I W, et al. Spectral Embedded Clustering: A Framework for In-sample and Out-of-sample Spectral Clustering. IEEE Trans on Neural Networks, 2011, 22(11): 1796-1808. [22] CAI D, HE X F, HAN J W, et al. Graph Regularized Non-negative Matrix Factorization for Data Representation. IEEE Trans on Pattern Analysis and Machine Intelligence, 2010, 33(8): 1548-1560. [23] ZHANG L, YANG M, FENG X C. Sparse Representation or Co-llaborative Representation: Which Helps Face Recognition // Proc of the IEEE International Conference on Computer Vision. Barcelona, Spain, 2011: 471-478. [24] SETTLES B. Active Learning Literature Survey. Technical Report, 1648. Madison, USA: University of Wisconsin, 2010. [25] NG A Y, JORDAN M I, WEISS Y. On Spectral Clustering: Ana-lysis and an Algorithm // DIETTERICH T G, BECKER S, GHAHRAMANI Z, eds. Advances in Neural Information Proce-ssing Systems 14. Cambridge, USA: MIT Press, 2002: 849-856. |
|
|
|