Manifold Outlier Detection Algorithm Based on Local-Correlation Dimension
HUANG Tian-Qiang1,2, LI Kai1, GUO Gong-De1
1. Department of Computer Science, School of Mathematics and Computer Science, Fujian Normal University, Fuzhou 350007; 2. Department of Computer Science and Technology, Tsinghua University, Beijing 100084
Abstract: Traditional outlier detection algorithms are not suited to detecting outliers on manifolds. Denoising algorithms for manifold learning have been reported, but few algorithms target manifold outlier detection. Motivated by experimental observations, a manifold outlier detection algorithm based on the local-correlation dimension is proposed. First, the properties of the intrinsic dimension are discussed, and the local-correlation dimension is adopted as a measure of manifold outliers. It is then proved that the local-correlation dimension characterizes outliers on manifolds. Finally, a manifold outlier detection algorithm is built on this property. Experiments on synthetic and real data show that the algorithm detects manifold outliers and outperforms the recently reported manifold blurring mean shift algorithm.
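The abstract does not state the exact definition of the local-correlation dimension or the decision rule, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes a pointwise reading: for each point, the number of neighbors within radius r is evaluated at its k nearest-neighbor distances, and the least-squares slope of log(count) against log(radius) is taken as the local dimension estimate. The function names and the parameters `k` and `tol` are hypothetical, and flagging by deviation from the median estimate is an illustrative rule only. The sketch also assumes the data contain no duplicate points.

```python
import numpy as np

def local_correlation_dimension(X, k=10):
    """Estimate a pointwise dimension for every row of X (assumed definition).

    For point i, N_i(r) counts the neighbors within distance r. Evaluated at
    the k nearest-neighbor distances d_1 <= ... <= d_k this gives N_i(d_j) = j,
    and the slope of log j against log d_j is the local estimate.
    """
    X = np.asarray(X, dtype=float)
    # Full pairwise Euclidean distance matrix (adequate for small data sets).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sort each row; column 0 is the zero self-distance, so it is skipped.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    log_r = np.log(knn)
    log_n = np.log(np.arange(1, k + 1))
    # Per-point least-squares slope of log_n regressed on log_r.
    r_c = log_r - log_r.mean(axis=1, keepdims=True)
    n_c = log_n - log_n.mean()
    return (r_c * n_c).sum(axis=1) / (r_c ** 2).sum(axis=1)

def flag_manifold_outliers(X, k=10, tol=1.0):
    """Return indices whose estimate deviates from the median by more than tol.

    The median-deviation rule and the tol parameter are illustrative
    assumptions, not taken from the paper.
    """
    dims = local_correlation_dimension(X, k=k)
    return np.flatnonzero(np.abs(dims - np.median(dims)) > tol)
```

For evenly spaced samples on a circle embedded in three dimensions plus one point placed well off the curve, the on-curve estimates stay close to 1, while the off-curve point sees all of its nearest neighbors at nearly the same distance and therefore receives a much larger estimate.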