Abstract:A method named Simulated Cutting Algorithm (SCA) is introduced for SVM incremental learning. SCA computes the anticipated contribution for the mapped target of each training sample in feature space mapped by a kernel function, and then chooses samples with higher anticipated contribution for SVM incremental learning. It effectively solves the problems in traditional incremental learning, such as higher training cost, lower accuracy for selecting target samples and lacking robustness. The anticipated contribution rate of a sample is indicated by the recognition rate towards two classes of samples of an appropriate separating hyperplane going through the mapped target of this sample point. Since the way for choosing target samples is very similar to that for paring garden stuff, the proposed algorithm acquires its name from this. Numerical experiments on benchmark datasets show the proposed method is superior in learning efficiency and generalization performance of a classifier. The application of the proposed algorithm in learning with limited resources demonstrates its excellent performance in large-scale learning tasks.
[1] Vapnik V N. The Nature of Statistical Learning Theory. London, UK: Spring-Verlag, 1995 [2] Burges C J C. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 1998, 2(2): 121-167 [3] Cristianini N, Shawe-Taylor J. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge, UK: Cambridge University Press, 2000 [4] Jia Hongbin, Murphey Y L, Gutchess D, et al. Identifying Knowledge Domain and Incremental New Class Learning in SVM // Proc of the IEEE International Joint Conference on Neural Networks. Montréal, Canada, 2005, Ⅴ: 2742-2747 [5] Wan Sheng, Banta L E. Parameter Incremental Learning Algorithm for Neural Networks. IEEE Trans on Neural Networks, 2006, 17(6): 1424-1438 [6] Sang Nong, Zhang Rong, Zhang Tianxu. Incremental Learning Algorithm of a Modified Minimum Distance Classifier. Pattern Recognition and Artificial Intelligence, 2007, 20(3): 358-364 (in Chinese) (桑 农,张 荣,张天序.一类改进的最小距离分类器的增量学习算法.模式识别与人工智能, 2007, 20(3): 358-364) [7] Li Kai, Huang Houkuan. Research on Incremental Learning Algorithm of Support Vector Machine. Journal of Northern Jiaotong University, 2003, 27(5): 34-37 (in Chinese) (李 凯,黄厚宽.支持向量机增量学习算法研究.北方交通大学学报, 2003, 27(5): 34-37) [8] Kong Rui, Zhang Bing. A Fast Incremental Learning Algorithm for Support Vector Machine. Control and Decision, 2005, 20(10): 1129-1136 (in Chinese) (孔 锐,张 冰.一种快速支持向量机增量学习算法.控制与决策, 2005, 20(10): 1129-1136) [9] Mitra P, Murthy C A, Pal S K. Data Condensation in Large Databases by Incremental Learning with Support Vector Machines // Proc of the International Conference on Pattern Recognition. Barcelona, Spain, 2000, Ⅱ: 2708-2711 [10] Xiao Xianbo, Hu Guangshu. An Incremental Support Vector Machine Based Speech Activity Detection Algorithm // Proc of the 27th Annual International Conference of the Engineering in Medicine and Biology Society. Shanghai, China, 2005: 4224-4226 [11] Domeniconi C, Gunopulos D. Incremental Support Vector Machine Construction // Proc of the IEEE International Conference on Data Mining. San Jose, USA, 2001: 589-592 [12] An Jinlong, Wang Zhengou, Ma Zhenping. An Incremental Learning Algorithm for Support Vector Machine // Proc of the 2nd International Conference on Machine Learning and Cybernetics. Xian, China, 2003, Ⅱ: 1153-1156 [13] Katagiri S, Abe S. Selecting Support Vector Candidates for Incremental Training // Proc of the IEEE International Conference on Systems, Man and Cybernetics. Waikoloa, USA, 2005, Ⅱ: 1258-1263 [14] Katagiri S, Abe S. Incremental Training of Support Vector Machines Using Hyperspheres. Pattern Recognition Letters, 2006, 27(13): 1495-1507 [15] Shawe-Taylor J, Cristianini N. Kernel Methods for Pattern Analysis. Cambridge, UK: Cambridge University Press, 2004 [16] Joachims T. Making Large-Scale SVM Learning Practical // Schlkopf B, Burges C J C, Smola A, eds. Advances in Kernel Methods: Support Vector Machines. Cambridge, USA: MIT Press, 1999: 169-184 [17] Newman D J, Hettich S, Blake C L, et al. UCI Repository of Machine Learning Databases [DB/OL]. [2009-03-20], http://www.ics.uci.edu/~mlearn/MLRepository.html [18] DELVE-Benchmark Repository-A Collection of Artificial and Real-World Data Sets [DB/OL]. [2009-03-20]. http://www.cs.utoronto.ca/~delve/data/datasets.html [19] Benchmark Repository Used for the STATLOG Competition. [1996-03-20]. ftp://ftp.ncc.up.pt/pub/statlog [20] Laskov P. Feasible Direction Decomposition Algorithms for Training Support Vector Machines. Machine Learning, 2002, 46(1/2/3): 315-349 [21] Platt J C. Fast Training of Support Vector Machines Using Sequential Minimal Optimization // Schlkopf B, Burges C J C, Smola A J, eds. Advances in Kernel Methods: Support Vector Learning. Cambridge, USA: MIT Press, 1999: 185-208 [22] Laskov P, Gehl C, Krüger S, et al. Incremental Support Vector Learning: Analysis, Implementation and Applications. Journal of Machine Learning Research, 2006, 7: 1909-1936 [23] Cauwenberghs G, Poggio T. Incremental and Decremental Support Vector Machine Learning // Leen T K, Dietterich T G, Tresp V, eds. Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2001, XIII: 409-415 [24] Bordes A, Ertekin S, Wesdon J, et al. Fast Kernel Classifiers for Online and Active Learning. Journal of Machine Learning Research, 2005, 6: 1579-1619 [25] Kivinen J, Smola A J, Williamson R C. Online Learning with Kernels. IEEE Trans on Signal Processing, 2004, 52(8): 2165-2176 [26] Cavallanti G, Cesa-Bianchi N, Gentile C. Tracking the Best Hyperplane with a Simple Budget Perceptron. Machine Learning, 2007, 69(2/3): 143-167