1. Faculty of Computer, Guangdong University of Technology, Guangzhou 510006
2. School of Mathematical Science, South China University of Technology, Guangzhou 510641
Abstract: A robust least squares support vector machine (RLS-SVM) algorithm for regression is proposed, based on recursive outlier elimination. In each loop, the sample with the largest error is singled out and diagnosed by a statistical hypothesis test. If the sample is diagnosed as an outlier, it is eliminated and the LS-SVM is re-trained on the remaining samples, which provides more accurate information for the subsequent outlier diagnosis and elimination. A decremental-learning method is introduced into the re-training stage to reduce the computation, so the additional computational complexity of RLS-SVM stays below O(N³). Experimental results on simulated and real-world datasets demonstrate the validity of the proposed algorithm and reveal its potential for building an outlier detector.
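To make the loop described above concrete, the following is a minimal Python sketch of the recursive outlier-elimination idea, not the paper's exact procedure. The RBF kernel, the fixed hyperparameters (gamma, sigma), the plain standardized-residual z-test standing in for the paper's hypothesis test, and full re-training in place of the decremental-learning update are all illustrative assumptions.

```python
import numpy as np
from scipy import stats

def rbf_kernel(X, Z, sigma=1.0):
    # Gaussian RBF kernel matrix between row-sample matrices X and Z (assumed kernel choice)
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def train_lssvm(X, y, gamma=10.0, sigma=1.0):
    # Solve the standard LS-SVM regression KKT system
    # [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    b, alpha = sol[0], sol[1:]
    return alpha, b

def predict(X_train, alpha, b, X_test, sigma=1.0):
    return rbf_kernel(X_test, X_train, sigma) @ alpha + b

def rlssvm(X, y, gamma=10.0, sigma=1.0, level=0.05):
    # Recursive outlier elimination around LS-SVM (illustrative sketch):
    # train, test the largest residual, eliminate if significant, re-train.
    keep = np.arange(len(y))
    while len(keep) > 2:
        alpha, b = train_lssvm(X[keep], y[keep], gamma, sigma)
        resid = y[keep] - predict(X[keep], alpha, b, X[keep], sigma)
        i = np.argmax(np.abs(resid))              # sample with the largest error
        z = np.abs(resid[i]) / resid.std(ddof=1)  # standardized residual
        # two-sided test: stop once the worst residual is not significant
        # (a simple z-test stands in for the paper's hypothesis test)
        if z < stats.norm.ppf(1.0 - level / 2):
            break
        keep = np.delete(keep, i)                 # eliminate the outlier, then re-train
    return train_lssvm(X[keep], y[keep], gamma, sigma), keep
```

In this sketch the eliminated indices double as an outlier detector, echoing the abstract's last claim; the sketch re-solves the full KKT system after each elimination, whereas the paper's decremental-learning update avoids that cost and keeps the added complexity below O(N³).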