Abstract: Kernel algorithms are a class of nonlinear techniques in machine learning with great practical value. In a kernel algorithm, the kernel matrix serves as the bridge between the input data and the learning algorithm. Low-rank approximation of the kernel matrix is an effective way to improve the computational efficiency of kernel algorithms and to reduce their memory requirements. Based on the correlation among the data and the minimal residual norm, a forward greedy algorithm, a backward greedy algorithm, and a mixed greedy algorithm are proposed to search for an optimal low-rank approximation. A sparse regression algorithm (SRA) is then presented that effectively prunes the training samples while preserving good generalization capacity. Compared with SVM, KA, and KPCR on two datasets, the superiority of SRA is demonstrated.
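The abstract gives no pseudocode, but the forward greedy step it describes (at each iteration, selecting the sample that most reduces the residual norm of the kernel-matrix approximation) matches the standard pivoted incomplete Cholesky factorization. Below is a minimal Python/NumPy sketch under that assumption; the RBF kernel choice, the function names, and the parameters are illustrative, not taken from the paper.

    import numpy as np

    def rbf_kernel(X, Y, gamma=1.0):
        # Gaussian RBF kernel matrix between the rows of X and Y
        sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
        return np.exp(-gamma * sq)

    def forward_greedy_lowrank(X, rank, gamma=1.0, tol=1e-10):
        # Pivoted incomplete Cholesky: builds G with K ~= G @ G.T,
        # G of shape (n, rank). Each step picks the pivot with the
        # largest residual diagonal entry, i.e. the column whose
        # inclusion most reduces trace(K - G @ G.T) -- a forward
        # greedy selection rule based on the residual norm.
        n = X.shape[0]
        G = np.zeros((n, rank))
        d = np.ones(n)        # residual diagonal; diag(K) = 1 for the RBF kernel
        pivots = []
        for j in range(rank):
            i = int(np.argmax(d))      # greedy choice: largest residual
            if d[i] < tol:             # K is already well approximated
                return G[:, :j], pivots
            pivots.append(i)
            k_i = rbf_kernel(X, X[i:i + 1], gamma).ravel()  # i-th kernel column
            G[:, j] = (k_i - G[:, :j] @ G[i, :j]) / np.sqrt(d[i])
            d = d - G[:, j]**2         # update residual diagonal
            d[i] = 0.0                 # pivot row is now represented exactly
        return G, pivots

A quick check of the approximation quality on synthetic data:

    X = np.random.randn(200, 5)
    G, piv = forward_greedy_lowrank(X, rank=20, gamma=0.5)
    err = np.linalg.norm(rbf_kernel(X, X, 0.5) - G @ G.T)

The selected pivots play the role of the retained "training samples": a low residual norm at a small rank is what allows the pruning the abstract attributes to SRA without losing generalization capacity.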