Abstract: Aiming at the problems of relevance vector machine (RVM) classification, such as low precision and the difficulty of kernel parameter selection, a concept called the critical sliding threshold is presented in this paper. A classifier combining RVM with K-nearest neighbour (KNN), called the KNN-RVM classifier, is constructed. Three theorems are proposed and proved. The first is that the process of KNN-RVM classification is equivalent to an implementation of the soft-margin SVM. The second is that the KNN-RVM classifier is equivalent to a 1NN classifier in which only one representative point is selected for each class. The last is that the result of KNN-RVM classification is superior to that of RVM classification. The sliding and critical characteristics of the critical sliding threshold are verified on three different datasets, as are the accuracy, adaptability and global optimality of the KNN-RVM classifier. The KNN-RVM classifier improves classification precision and reduces the algorithm's reliance on the kernel parameter, and is thereby shown to be an effective classifier.
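To make the hybrid scheme concrete, the sketch below illustrates one plausible reading of the KNN-RVM decision rule described above: samples whose RVM posterior falls inside an ambiguous band around 0.5 (the band width standing in for the critical sliding threshold) are re-classified by KNN, while all other samples keep the RVM's own label. This is a minimal illustration, not the paper's implementation: scikit-learn ships no RVM, so any fitted probabilistic classifier is used as a stand-in, and the names `epsilon` and `KNNRVMClassifier` are hypothetical. Binary 0/1 labels are assumed.

```python
# Hedged sketch of a KNN-RVM-style hybrid decision rule (assumptions noted above).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


class KNNRVMClassifier:
    """Re-label samples inside the ambiguous band |p - 0.5| < epsilon with KNN;
    accept the base (RVM-like) classifier's label everywhere else."""

    def __init__(self, rvm, epsilon=0.1, k=5):
        self.rvm = rvm          # assumed: any model exposing fit() / predict_proba()
        self.epsilon = epsilon  # hypothetical stand-in for the critical sliding threshold
        self.knn = KNeighborsClassifier(n_neighbors=k)

    def fit(self, X, y):
        self.rvm.fit(X, y)
        self.knn.fit(X, y)      # KNN simply memorises the training set
        return self

    def predict(self, X):
        proba = self.rvm.predict_proba(X)[:, 1]     # P(class 1 | x) from the base model
        labels = (proba >= 0.5).astype(int)         # default decision at threshold 0.5
        ambiguous = np.abs(proba - 0.5) < self.epsilon
        if ambiguous.any():                         # fall back to KNN near the boundary
            labels[ambiguous] = self.knn.predict(X[ambiguous])
        return labels


# Usage sketch with a logistic-regression stand-in for the RVM:
#   from sklearn.linear_model import LogisticRegression
#   clf = KNNRVMClassifier(LogisticRegression(), epsilon=0.1, k=5).fit(X_train, y_train)
#   y_pred = clf.predict(X_test)
```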