A Fast Learning Algorithm Based on Minimum Enclosing Ball for Large Domain Adaptation
XU Min 1,2, WANG Shi-Tong 1, GU Xin 1,3, YU Lin 2
1. School of Digital Media, Jiangnan University, Wuxi 214122
2. School of Internet of Things Engineering, Wuxi Institute of Technology, Wuxi 214121
3. Wuxi Northern Lake Optical Co., Ltd., Wuxi 214035
Abstract: Data domains collected at different times, in different places, or by different devices are not always complete, even when they come from the same data source. To transfer knowledge effectively between two such domains, a theorem is proposed: the difference between the probability distributions of two domains can be expressed in terms of the centers of the domains' minimum enclosing balls, and its upper bound does not depend on the radius. Based on this theorem, a fast center-calibration domain adaptation algorithm, center calibration-core sets support vector data description (CC-CSVDD), is proposed for large domain adaptation by modifying the original support vector domain description (SVDD) algorithm. The validity of the proposed algorithm is verified experimentally on artificial datasets and on the real KDD CUP-99 datasets, where it shows good performance.
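To make the role of the ball centers concrete, the following is a minimal sketch, not the CC-CSVDD algorithm itself. It approximates each domain's minimum enclosing ball in plain input space with the Badoiu-Clarkson farthest-point iteration (a stand-in chosen here for illustration; the paper works with kernelized SVDD and core sets) and then measures the distance between the two centers. The function names `approx_meb` and `center_shift` and the parameter `eps` are assumptions introduced for this example.

```python
import math


def _dist(p, q):
    """Euclidean distance between two points given as coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def approx_meb(points, eps=0.05):
    """Approximate the minimum enclosing ball (MEB) of `points`.

    Illustrative stand-in: starts at the first point and repeatedly moves the
    center toward the current farthest point with step size 1/(i+1).
    About 1/eps^2 iterations bring the center within eps * R of the optimal
    center, where R is the optimal radius.
    """
    center = tuple(points[0])
    iterations = math.ceil(1.0 / eps ** 2)
    for i in range(1, iterations + 1):
        farthest = max(points, key=lambda p: _dist(p, center))
        step = 1.0 / (i + 1)
        center = tuple(c + step * (f - c) for c, f in zip(center, farthest))
    # Radius of the ball around the approximate center that covers all points.
    radius = max(_dist(p, center) for p in points)
    return center, radius


def center_shift(source, target, eps=0.05):
    """Distance between the approximate MEB centers of two domains."""
    source_center, _ = approx_meb(source, eps)
    target_center, _ = approx_meb(target, eps)
    return _dist(source_center, target_center)


# Hypothetical toy domains: a square and the same square shifted by 3 along x.
source_domain = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
target_domain = [(x + 3.0, y) for x, y in source_domain]
shift = center_shift(source_domain, target_domain)
```

In this toy case the two domains have the same radius and differ only by a translation, so `shift` comes out close to 3 (up to the `eps` tolerance). It illustrates the quantity the theorem refers to: the gap between domains is read off the centers, not the radii.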