Fast Scalable Subspace Clustering Algorithm<sup>*</sup>

doi:10.16451/j.cnki.issn1003-6059.201601002


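Abstract: Existing subspace clustering algorithms can handle only small-scale data and cannot handle out-of-sample data. To address this problem, the paper proposes a subspace clustering framework based on a two-stage sampling strategy (TSSC). The framework has two core components: discriminative collaborative representation (DCR) and a multi-scale K-nearest-neighbor (KNN) sampling method. In TSSC, DCR is first combined with multi-scale KNN to transform the features of the data points, so that points belonging to the same subspace have more consistent representations. To improve scalability, TSSC then applies multi-scale KNN in the new feature space to sample the data a second time, trains a linear classifier on the preliminary clustering result obtained from the sampled points, and finally uses the learned classifier to classify the remaining samples, which yields the final clustering result. Experiments on real-world datasets verify the effectiveness of TSSC.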


Keywords: subspace clustering, self-expression, discriminative collaborative representation, multi-scale K nearest neighbors

Abstract: Most existing subspace clustering methods are inefficient on large-scale datasets and cannot handle out-of-sample data. To address these problems, a framework called two-stage sample selection for subspace clustering (TSSC) is proposed. TSSC consists of two key components: discriminative collaborative representation (DCR) and multi-scale K nearest neighbors (KNN). DCR is combined with multi-scale KNN for feature mapping, so that samples belonging to the same subspace have more consistent representations. To improve scalability, TSSC applies multi-scale KNN again to select informative points in the new feature space. A linear classifier is then trained on the clustering result obtained from the selected points. Finally, the remaining samples are classified to obtain the final clustering result. Experiments on real-world datasets verify the effectiveness of TSSC.
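The sample, cluster, train, classify pipeline described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the DCR objective or the multi-scale KNN sampling rule, so this sketch substitutes simpler stand-ins and should not be read as the authors' method: uniform random sampling replaces multi-scale KNN selection, plain ridge (collaborative) coding over the sampled points replaces DCR, normalized spectral clustering groups the sampled points, and the linear classifier simply uses each cluster's mean feature vector as its weight vector. All function names and parameters (`ridge_codes`, `spectral_labels`, `sample_cluster_classify`, `lam`, `m`) are hypothetical.

```python
import numpy as np


def ridge_codes(D, X, lam=1e-2):
    """Code each row of X over the rows of dictionary D by ridge regression.

    Stand-in for DCR (assumption). Returns absolute coefficients,
    shape (len(X), len(D)).
    """
    gram = D @ D.T
    coef = np.linalg.solve(gram + lam * np.eye(len(D)), D @ X.T)
    return np.abs(coef.T)


def spectral_labels(W, k, iters=20):
    """Normalized spectral clustering of a symmetric affinity matrix W."""
    inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    lap = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    U = vecs[:, :k]
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    # Tiny k-means with deterministic farthest-first initialisation.
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        d2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(d2, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels


def sample_cluster_classify(X, k, m, lam=1e-2, seed=0):
    """Cluster the rows of X into k groups using only m sampled points."""
    rng = np.random.default_rng(seed)
    # Uniform sampling: a stand-in for multi-scale KNN selection (assumption).
    idx = rng.choice(len(X), size=m, replace=False)
    # New features: every point coded over the sampled points.
    F = ridge_codes(X[idx], X, lam)
    Fs = F[idx]
    # Preliminary clustering of the sampled points only (m x m affinity).
    W = Fs + Fs.T
    np.fill_diagonal(W, 0.0)
    sample_labels = spectral_labels(W, k)
    # Linear classifier: one weight vector per cluster (mean sampled feature).
    weights = np.stack(
        [Fs[sample_labels == j].mean(axis=0) for j in range(k)], axis=1
    )
    labels = np.argmax(F @ weights, axis=1)
    labels[idx] = sample_labels  # keep the clustering result on the samples
    return labels
```

Calling `sample_cluster_classify(X, k=2, m=40)` on a data matrix `X` with one point per row returns one cluster label per point. Only the `m x m` affinity matrix is ever decomposed, which is the source of the scalability gain: the remaining points cost one linear solve and one matrix product.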

Key words: Subspace Clustering, Self-expression, Discriminative Collaborative Representation, Multi-scale K Nearest Neighbors

Received: 2015-05-13

CLC number: TP 301

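Funding: Supported by the National Natural Science Foundation of China (No. 61370129, 61375062), the Specialized Research Fund for the Doctoral Program of Higher Education of the Ministry of Education (No. 20120009110006), the Program for Changjiang Scholars and Innovative Research Team in University (No. IRT201206), and the Fundamental Research Funds for the Central Universities (No. 2014JBM029).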

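About the authors: LIU Bo, male, born in 1981, Ph.D. candidate. His research interests include subspace learning, semi-supervised learning, and face recognition. E-mail: liubohbu@126.com. XIE Bojun, male, born in 1981, Ph.D. candidate. His research interests include machine learning and computer vision. E-mail: 8054499@qq.com. ZHU Jie, male, born in 1982, Ph.D. candidate. His research interests include machine learning, object recognition, and image classification. E-mail: 13633589@qq.com. JING Liping (corresponding author), female, born in 1978, Ph.D., professor. Her research interests include data mining, text mining, bioinformatics, and enterprise intelligence. E-mail: lipingjing@bjtu.edu.cn. YU Jian, male, born in 1969, Ph.D., professor. His research interests include cluster analysis and image processing. E-mail: jianyu@bjtu.edu.cn.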

Cite this article:

LIU Bo, XIE Bojun, ZHU Jie, JING Liping, YU Jian. Fast Scalable Subspace Clustering Algorithm[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2016, 29(1): 11-21.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.201601002 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2016/V29/I1/11
