Adaptive Sequence Learning and Applications for Multi-Scale Kernel Method
WANG Hong-Qiao1,2, CAI Yan-Ning2, SUN Fu-Chun1, ZHAO Zong-Tao2
1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084; 2. Department of Command Automation, The Second Artillery Engineering College, Xi'an 710025
Abstract: The multi-scale kernel method is a current research focus in kernel machine learning. However, the way multi-scale kernel learning methods handle multiple kernels has several drawbacks: the kernels are often combined by simple averaging, training time grows under iterative training, and the combination coefficients are selected empirically. Based on the kernel-target alignment heuristic, an adaptive sequence learning algorithm for the multi-scale kernel method is presented, with which the weighting coefficients of the multiple kernels are obtained automatically and rapidly. Experimental results show that the proposed algorithm achieves better regression precision and classification accuracy, and is more stable, than SVM methods using different single kernels. Moreover, the proposed algorithm is widely applicable.
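To make the kernel-target alignment idea concrete, the following is a minimal sketch of how alignment scores might be turned into weights for Gaussian kernels at several scales. It is not the adaptive sequence learning algorithm of this paper, whose details the abstract does not give. The function names, the choice of Gaussian kernels, and the rule of normalizing alignment scores into weights are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Gram matrix of a Gaussian (RBF) kernel with width sigma."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def alignment(K, y):
    """Empirical kernel-target alignment between K and y y^T, labels in {-1, +1}."""
    target = np.outer(y, y)
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    return np.sum(K * target) / (np.linalg.norm(K) * np.linalg.norm(target))

def alignment_weights(X, y, sigmas):
    """Hypothetical helper: weight each scale in proportion to its alignment."""
    scores = np.array([max(alignment(gaussian_kernel(X, s), y), 0.0) for s in sigmas])
    total = scores.sum()
    if total == 0.0:
        # Fall back to uniform weights if no kernel aligns with the labels.
        return np.full(len(sigmas), 1.0 / len(sigmas))
    return scores / total

def combined_kernel(X, sigmas, weights):
    """Weighted sum of the multi-scale Gram matrices."""
    return sum(w * gaussian_kernel(X, s) for w, s in zip(weights, sigmas))
```

The combined Gram matrix returned by `combined_kernel` could then be passed to any SVM implementation that accepts a precomputed kernel. Because each weight is computed in a single pass from its own kernel, no iterative retraining is needed in this sketch.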