Semi-Supervised Learning Based on One-Class Classification
MIAO Zhi-Min1, ZHAO Lu-Wen1, HU Gu-Yu2, WANG Qiong2
1. Institute of Communication Engineering, PLA University of Science and Technology, Nanjing 210007
2. Institute of Command Automation, PLA University of Science and Technology, Nanjing 210007
Abstract: A semi-supervised learning algorithm based on one-class classification is proposed. First, a one-class classifier is built on the labeled dataset for each class of data. Some of the unlabeled data are then tested by these one-class classifiers, and the classification results are used to adjust and optimize the two classification surfaces. All labeled data and some recognized unlabeled data are used to train a base classifier. The label of a test sample is determined according to the classification results of the base classifiers. Experimental results on UCI datasets show that the average detection precision of the proposed algorithm is 4.5% higher than that of the tri-training algorithm and 8.9% higher than that of a classifier trained on labeled data only.
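The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy one-class model (a sphere around the class centroid, with a hypothetical `quantile` parameter setting the radius) and a nearest-centroid base classifier, and it omits the step that adjusts and optimizes the classification surfaces. All function names and data are illustrative.

```python
import math

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def fit_one_class(points, quantile=0.9):
    """Toy one-class model (assumption): centroid plus a distance threshold."""
    c = centroid(points)
    dists = sorted(math.dist(c, p) for p in points)
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return c, dists[idx]

def accepts(model, x):
    """True if x falls inside the sphere described by the model."""
    c, radius = model
    return math.dist(c, x) <= radius

def pseudo_label(models, unlabeled):
    """Keep unlabeled points accepted by exactly one class model."""
    recognized = []
    for x in unlabeled:
        hits = [label for label, m in models.items() if accepts(m, x)]
        if len(hits) == 1:
            recognized.append((x, hits[0]))
    return recognized

def train_base(samples):
    """Nearest-centroid base classifier (assumption) from (point, label) pairs."""
    by_class = {}
    for x, y in samples:
        by_class.setdefault(y, []).append(x)
    return {y: centroid(pts) for y, pts in by_class.items()}

def predict(base, x):
    """Label of the closest class centroid."""
    return min(base, key=lambda y: math.dist(base[y], x))

# Illustrative data: two labeled classes and three unlabeled points.
labeled = [((0.0, 0.0), "a"), ((0.2, 0.1), "a"), ((0.1, 0.3), "a"),
           ((3.0, 3.0), "b"), ((3.2, 2.9), "b"), ((2.9, 3.3), "b")]
unlabeled = [(0.1, 0.1), (3.1, 3.1), (1.5, 1.5)]

# Step 1: one one-class model per class, built on labeled data only.
models = {}
for label in sorted({y for _, y in labeled}):
    models[label] = fit_one_class([x for x, y in labeled if y == label])

# Step 2: test the unlabeled data; ambiguous or rejected points are dropped.
recognized = pseudo_label(models, unlabeled)

# Step 3: train the base classifier on labeled plus recognized data.
base = train_base(labeled + recognized)
```

Here the point `(1.5, 1.5)` is rejected by both one-class models and therefore never reaches the base classifier, which is the mechanism that keeps unreliable unlabeled samples out of training in this simplified setting.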