Preference Feature Extraction Based on Column Union Row Matrix Decomposition<sup>*</sup>

doi:10.16451/j.cnki.issn1003-6059.201703010


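Abstract (translated from the Chinese original): Singular value decomposition (SVD) does not analyze preference features accurately enough and sometimes yields results that cannot be interpreted. To address this, the paper proposes using the column union row (CUR) matrix decomposition to obtain a low-rank approximation of the original matrix M (user preferences for products) and to extract the latent preferences of users and products. First, the statistical leverage scores of the rows and columns of M are computed, and several high-scoring columns and rows are extracted to form the low-dimensional matrices C and R. The matrix U is then constructed approximately from M, C and R. This turns the problem of extracting preference features in a high-dimensional space into a matrix analysis problem in a low-dimensional space, with better interpretability and accuracy. Finally, theoretical analysis and experiments show that, compared with traditional decomposition methods, the CUR matrix decomposition achieves higher accuracy, better interpretability and a higher compression ratio in preference feature extraction.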
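
The quantities named in the abstract can be written out as a short worked formulation. This is a sketch of the standard leverage-score form of CUR, which the paper may adapt or refine. The symbols k (target rank), n (number of columns), σ_ξ, u^(ξ), v^(ξ) and π_j are notation assumed here for illustration and do not appear in the abstract.

```latex
% SVD of the preference matrix M (m users x n products)
M = \sum_{\xi=1}^{\operatorname{rank}(M)} \sigma_\xi \, u^{(\xi)} \bigl(v^{(\xi)}\bigr)^{\mathsf T}

% Normalized statistical leverage score of column j, using the top-k right singular vectors
\pi_j = \frac{1}{k} \sum_{\xi=1}^{k} \bigl(v^{(\xi)}_j\bigr)^2 ,
\qquad \pi_j \ge 0 , \qquad \sum_{j=1}^{n} \pi_j = 1

% C holds the selected columns of M, R the selected rows; U links them
U = C^{+} M R^{+} , \qquad M \approx C \, U \, R
```

Row scores follow the same pattern with the left singular vectors u^(ξ) in place of v^(ξ). Because C and R consist of actual columns and rows of M, each retained factor corresponds to a real product or a real user, which is where the interpretability claim comes from.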


Keywords: column union row (CUR) matrix decomposition, low-rank approximation, preference feature, statistical leverage score, interpretability

Abstract: Preference features cannot be analyzed accurately by singular value decomposition (SVD), and the results are sometimes not interpretable. To address these problems, a column union row (CUR) matrix decomposition method is proposed to obtain a low-rank approximation of the original matrix M (user preferences for products) and to extract the latent preferences of users and products. First, the statistical leverage scores of the rows and columns of M are calculated. Then, several columns and rows with higher scores are extracted to form the low-dimensional matrices C and R. Subsequently, the matrix U is constructed approximately from the matrices M, C and R. With the proposed method, the problem of extracting preference features in a high-dimensional space is transformed into a matrix analysis problem in a lower-dimensional space, which gives better accuracy and interpretability. Finally, theoretical analysis and experiments indicate that, compared with traditional decomposition methods, the CUR matrix decomposition method achieves higher accuracy, better interpretability and a higher compression ratio in preference feature extraction.
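
The three steps in the abstract (score, select, construct U) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `leverage_scores` and `cur_decompose`, the parameters `k`, `c` and `r`, the deterministic top-score selection, the choice U = C⁺ M R⁺, and the synthetic matrix are all hypothetical choices made here.

```python
import numpy as np


def leverage_scores(M, k, axis):
    """Normalized leverage scores from the top-k singular vectors of M.

    axis=1 scores the columns (right singular vectors),
    axis=0 scores the rows (left singular vectors).
    """
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    if axis == 1:
        return (Vt[:k, :] ** 2).sum(axis=0) / k
    return (U[:, :k] ** 2).sum(axis=1) / k


def cur_decompose(M, k, c, r):
    """Pick the c highest-scoring columns and r highest-scoring rows of M,
    then build U so that C @ U @ R approximates M (assumed: U = C^+ M R^+)."""
    col_scores = leverage_scores(M, k, axis=1)
    row_scores = leverage_scores(M, k, axis=0)
    # Indices of the highest scores, sorted so C and R keep M's original order.
    cols = np.sort(np.argsort(col_scores)[::-1][:c])
    rows = np.sort(np.argsort(row_scores)[::-1][:r])
    C = M[:, cols]                                  # actual columns of M
    R = M[rows, :]                                  # actual rows of M
    U = np.linalg.pinv(C) @ M @ np.linalg.pinv(R)   # small c x r linking matrix
    return C, U, R, cols, rows


# Synthetic rank-2 "users x products" matrix (made-up data, for illustration only).
rng = np.random.default_rng(0)
M = rng.random((8, 2)) @ rng.random((2, 6))

C, U, R, cols, rows = cur_decompose(M, k=2, c=2, r=2)
print("selected columns:", cols, "selected rows:", rows)
print("reconstruction error:", np.linalg.norm(M - C @ U @ R))
```

Since the synthetic M has exact rank 2 and the selected columns and rows are generically independent, the product C U R should reproduce M up to floating-point error. On real rating data the error depends on k, c and r. Pure top-score selection can also pick several near-duplicate columns, which is one reason randomized sampling proportional to the scores is a common alternative.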

Key words: Column Union Row (CUR) Matrix Decomposition, Low-Rank Approximation, Preference Feature, Statistical Leverage Score, Interpretability

Received: 2016-05-13

Chinese Library Classification (ZTFLH): TP 181

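Funding: Supported by the National Natural Science Foundation of China (No. 61572419, 61572418, 61403328, 61403329) and the Natural Science Foundation of Shandong Province (No. 2015GSF115009, ZR2014FQ016, ZR2014FQ026, ZR2013FM011)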

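About the authors: LEI Hengxin, male, born in 1993, master's student. His main research interests are the application of matrix decomposition methods to recommender systems and preference feature extraction. E-mail: leihengxin@msn.cn.
LIU Jinglei (corresponding author), male, born in 1970, master's degree, associate professor. His main research interests are artificial intelligence and theoretical computer science. E-mail: jinglei_liu@sina.com.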

Cite this article:

LEI Hengxin, LIU Jinglei. Preference Feature Extraction Based on Column Union Row Matrix Decomposition[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2017, 30(3): 279-288.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.201703010 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2017/V30/I3/279
