Cost-Sensitive SVM Based on Loss Functions with Weighted Margin
TAO Qing, LIANG Wan-Lu, KONG Kang, WANG Qun-Shan
New Star Research Institute of Applied Tech, Hefei 230031
Abstract Almost all available algorithms handle imbalanced problems by directly weighting the loss function. In this paper, a loss function that instead weights the margin in the hinge function is proposed, and its Bayes consistency is proved. A learning algorithm based on this loss, called Weighting Margin SVM (WMSVM), is then derived, and the sequential minimal optimization (SMO) algorithm can be modified to solve it. Experimental results on several benchmark datasets demonstrate the effectiveness of WMSVM. Both the theoretical and the experimental analyses indicate that the proposed weighted-margin loss is a useful addition to cost-sensitive learning.
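The contrast drawn in the abstract, weighting the loss versus weighting the margin, can be illustrated with a minimal sketch. The abstract does not give the exact form of the proposed loss, so the class-dependent margin form `max(0, m_y - y*f(x))` below is an assumption made for illustration, and the function and parameter names (`weighted_loss_hinge`, `weighted_margin_hinge`, `cost_pos`, `margin_pos`, and so on) are hypothetical rather than taken from the paper.

```python
import numpy as np


def weighted_loss_hinge(y, score, cost_pos=1.0, cost_neg=1.0):
    """Conventional cost-sensitive hinge: scale the loss by a per-class cost.

    y holds labels in {-1, +1}; score holds the decision values f(x).
    """
    cost = np.where(y > 0, cost_pos, cost_neg)
    return cost * np.maximum(0.0, 1.0 - y * score)


def weighted_margin_hinge(y, score, margin_pos=1.0, margin_neg=1.0):
    """Assumed weighted-margin hinge: replace the unit margin by a per-class margin.

    This form is an illustrative assumption, not the paper's stated definition.
    """
    margin = np.where(y > 0, margin_pos, margin_neg)
    return np.maximum(0.0, margin - y * score)


if __name__ == "__main__":
    y = np.array([1, -1])
    score = np.array([0.5, -0.5])
    # Positive class is treated as the costly (minority) class in both variants.
    print(weighted_loss_hinge(y, score, cost_pos=2.0, cost_neg=1.0))
    print(weighted_margin_hinge(y, score, margin_pos=2.0, margin_neg=1.0))
```

Under this assumed form, a larger margin for the costly class penalizes that class's examples until they lie further from the decision boundary, whereas loss weighting keeps the unit margin and only rescales the penalty.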
Received: 03 November 2010