Existing algorithms cannot achieve both low computational cost and good fitting quality on the large-scale data with non-stationary variation that arise in regression problems in industry, information processing and other fields. Therefore, a distributed regularized regression learning algorithm based on multi-scale Gaussian kernels is proposed. The hypothesis space of the proposed algorithm is a sum space composed of reproducing kernel Hilbert spaces generated by Gaussian kernels at multiple scales. Since the disjoint subsets partitioned from the whole dataset exhibit different degrees of fluctuation, kernel approximation models with different combination coefficients are established for them. A local estimator is then learned from each subset independently and in parallel by the least squares regularization method. Finally, a global approximation model is obtained by weighting all the local estimators. Experimental results on two simulated datasets and four real datasets show that, compared with existing algorithms, the proposed algorithm reduces running time substantially while retaining strong fitting ability.
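The pipeline described in the abstract can be summarized in a short sketch. The Python snippet below is a minimal illustration only: the bandwidths, the uniform per-subset combination coefficients, the regularization parameter, and the size-proportional merging weights are all assumptions for demonstration, not the paper's actual settings.

```python
# Minimal sketch of the distributed multi-scale kernel regularized regression
# pipeline. Concrete values (bandwidths, coefficients, lambda, weights) are
# illustrative assumptions, not the paper's settings.
import numpy as np

def multi_scale_gaussian_kernel(X, Z, sigmas, coeffs):
    """Sum-space kernel: a weighted sum of Gaussian kernels at several scales."""
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return sum(c * np.exp(-sq_dists / (2.0 * s ** 2))
               for s, c in zip(sigmas, coeffs))

def fit_local_estimator(X_j, y_j, sigmas, coeffs, lam):
    """Least squares regularized (kernel ridge) fit on one disjoint subset."""
    n = len(X_j)
    K = multi_scale_gaussian_kernel(X_j, X_j, sigmas, coeffs)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y_j)
    return X_j, alpha

def global_predict(models, weights, sigmas, coeffs_list, X_test):
    """Global model: weighted combination of all local estimators."""
    pred = np.zeros(len(X_test))
    for w, (X_j, alpha_j), coeffs in zip(weights, models, coeffs_list):
        pred += w * multi_scale_gaussian_kernel(X_test, X_j, sigmas, coeffs) @ alpha_j
    return pred

# Usage on synthetic data: partition, fit local estimators independently, merge.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(600, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(600)

sigmas = [0.1, 0.5, 2.0]                      # assumed multi-scale bandwidths
parts = np.array_split(rng.permutation(600), 3)
models, coeffs_list = [], []
for idx in parts:
    # The paper adapts the combination coefficients to each subset's degree of
    # fluctuation; uniform coefficients stand in for that step here.
    coeffs = [1.0 / 3.0] * 3
    models.append(fit_local_estimator(X[idx], y[idx], sigmas, coeffs, lam=1e-3))
    coeffs_list.append(coeffs)

weights = [len(idx) / 600 for idx in parts]   # size-proportional weighting
y_hat = global_predict(models, weights, sigmas, coeffs_list, X[:5])
```

Because each local estimator depends only on its own subset, the per-subset fits can run on separate machines, which is the source of the running-time reduction reported in the experiments.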
[1] CUCKER F, SMALE S. On the Mathematical Foundations of Learning. Bulletin of the American Mathematical Society, 2001, 39(1): 1-49.
[2] ZHENG D N, WANG J X, ZHAO Y N. Nonflat Function Estimation with a Multi-scale Support Vector Regression. Neurocomputing, 2006, 70(1/2/3): 420-429.
[3] SAUNDERS C, GAMMERMAN A, VOVK V. Ridge Regression Learning Algorithm in Dual Variables // Proc of the 15th International Conference on Machine Learning. New York, USA: ACM, 1998: 515-521.
[4] YANG H Q, XU Z L, YE J P, et al. Efficient Sparse Generalized Multiple Kernel Learning. IEEE Transactions on Neural Networks, 2011, 22(3): 433-446.
[5] SHAO X G. The Research and Application of Multiple Kernel Prediction Model Based on Statistical Learning Theory. Ph.D. Dissertation. Changsha, China: Central South University, 2013.
[6] KINGSBURY N, TAY D, PALANISWAMI M. Multi-scale Kernel Methods for Classification // Proc of the IEEE Workshop on Machine Learning for Signal Processing. Washington, USA: IEEE Press, 2005: 43-48.
[7] WANG H Q, CAI Y N, SUN F C, et al. Adaptive Sequence Learning and Applications for Multi-scale Kernel Method. Pattern Recognition and Artificial Intelligence, 2011, 24(1): 72-81.
[8] XU Y L, CHEN D R, LI H X, et al. Least Square Regularized Regression in Sum Space. IEEE Transactions on Neural Networks and Learning Systems, 2013, 24(4): 635-646.
[9] SHAMIR O, SREBRO N. Distributed Stochastic Optimization and Learning[C/OL]. [2018-11-12]. https://arxiv.org/pdf/1408.5294.pdf.
[10] ZHANG Y C, DUCHI J, WAINWRIGHT M. Divide and Conquer Kernel Ridge Regression: A Distributed Algorithm with Minimax Optimal Rates. Journal of Machine Learning Research, 2015, 16: 3299-3340.
[11] MÜCKE N, BLANCHARD G. Parallelizing Spectral Algorithms for Kernel Learning. Journal of Machine Learning Research, 2018, 19: 1-29.
[12] LIN S B, GUO X, ZHOU D X. Distributed Learning with Regularized Least Squares. Journal of Machine Learning Research, 2017, 18: 1-31.
[13] GUO Z C, LIN S B, ZHOU D X. Learning Theory of Distributed Spectral Algorithm. Inverse Problems, 2017, 33(7). DOI:10.1088/1361-6420/aa72b2.
[14] GUO Z C, SHI L, WU Q. Learning Theory of Distributed Regression with Bias Corrected Regularization Kernel Network. Journal of Machine Learning Research, 2017, 18: 1-25.
[15] JOSHI B, IUTZELER F, AMINI M R. Large-Scale Asynchronous Distributed Learning Based on Parameter Exchanges[J/OL]. [2018-11-12]. https://arxiv.org/pdf/1705.07751.pdf.
[16] TIKHONOV A N. Regularization of Incorrectly Posed Problems. Soviet Mathematics Doklady, 1963, 4(1): 1624-1627.
[17] ARONSZAJN N. Theory of Reproducing Kernels. Transactions of the American Mathematical Society, 1950, 68(3): 337-404.
[18] CUCKER F, ZHOU D X. Learning Theory: An Approximation Theory Viewpoint. Cambridge, UK: Cambridge University Press, 2007.