Proximal Support Vector Machines for Samples with Unbalanced Classification
TAO XiaoYan1,2, JI HongBing1, DONG ShuFu2
1. School of Electronic Engineering, Xidian University, Xi’an 710071
2. Institute of Telecommunication Engineering, Air Force Engineering University, Xi’an 710077
Abstract: The standard proximal support vector machine (PSVM) disregards class imbalance in the training data. To address this problem, a modified PSVM algorithm, named MPSVM, is presented. Different penalty factors are assigned to the positive and negative training sets according to their unbalanced sizes, and the penalty values are collected into a diagonal matrix. The decision functions for the linear and nonlinear MPSVM are then derived. Finally, comparisons of algorithmic principle and performance are made. The experimental results show that MPSVM generalizes better than PSVM and is more efficient than the unbalanced SVM.
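The idea of class-dependent penalties arranged in a diagonal matrix can be illustrated with a minimal sketch of a linear classifier. It is an assumption-laden illustration, not the paper's exact formulation: it assumes the usual PSVM objective with the scalar penalty replaced by a diagonal matrix N, which leads to the linear system (I + E'NE)z = E'Nd with E = [A, -e] and z = [w; gamma]. The function names `fit_weighted_linear_psvm` and `predict`, and the parameters `c_pos` and `c_neg`, are hypothetical.

```python
import numpy as np

def fit_weighted_linear_psvm(A, d, c_pos, c_neg):
    """Sketch of a linear PSVM with per-class penalties (assumed formulation).

    A: (m, n) training samples; d: (m,) labels in {+1, -1}.
    c_pos, c_neg: penalty factors for the positive and negative class.
    Returns (w, gamma) for the decision function sign(x @ w - gamma).
    """
    A = np.asarray(A, dtype=float)
    d = np.asarray(d, dtype=float)
    m, n = A.shape
    # Augmented data matrix E = [A, -e], so that E @ [w; gamma] = A @ w - gamma.
    E = np.hstack([A, -np.ones((m, 1))])
    # Diagonal of the penalty matrix N: one entry per sample, chosen by class.
    penalties = np.where(d > 0, c_pos, c_neg)
    # E' N computed by broadcasting, without forming the m-by-m matrix N.
    EtN = E.T * penalties
    # Solve (I + E' N E) z = E' N d, a system of size n + 1 only.
    z = np.linalg.solve(np.eye(n + 1) + EtN @ E, EtN @ d)
    return z[:-1], z[-1]

def predict(A, w, gamma):
    """Assign +1 or -1 according to the sign of x @ w - gamma."""
    return np.where(np.asarray(A, dtype=float) @ w - gamma >= 0, 1, -1)
```

One possible choice, again an assumption rather than the paper's rule, is to set each penalty inversely proportional to the class size, for example `c_pos = m / (2 * m_pos)` and `c_neg = m / (2 * m_neg)`, so that the smaller class is not dominated in the fit. Because only a system of dimension n + 1 is solved, the cost is governed by the number of features rather than the number of samples.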