|
|
Clustering Method of Time Series Based on EMD and K-means Algorithm |
LIU Hui-Ting1, NI Zhi-Wei2 |
1.School of Computer Science and Technology, Anhui University, Hefei 230039 2.Institute of Computer Network System, Hefei University of Technology, Hefei 230009 |
|
|
Abstract Dimension reduction of time series and noise in sequences filtering are important prerequisites for effective realization of time series clustering. A method is proposed to preprocess time series effectively. Firstly, the trend of a time sequence is got by using empirical mode decomposition method. Then, the trend series are divided into several segments by bottom-up algorithm. Finally, the piecewise series are translated into uniform sequences, and each of them is composed of -1, 0 and 1. To prove that the proposed method can achieve dimensionality reduction and filter out the noise from the data sequence, K-means algorithm is utilized to finish clustering of pretreated time series. Experimental results show clustering of pretreated data sequences is better than that of the original series.
|
Received: 10 October 2008
|
|
|
|
|
[1] Last M, Klein Y, Kandel A. Knowledge Discovery in Time Series Databases. IEEE Trans on Systems, Man and Cybernetics , 2001, 31(1): 160-169 [2] Moon Y S, Kim J. Efficient Moving Average Transform-Based Subsequence Matching Algorithms in Time-Series Databases. Information Sciences: An International Journal, 2007, 177(23): 5415-5431 [3] Kim S W, Park D H, Lee H G. Efficient Processing of Subsequence Matching with the Euclidean Metric in Time-Series Databases. Information Processing Letters, 2004, 90(5): 253-260 [4] Kontaki M, Papadopoulos A N, Manolopoulos Y. Adaptive Similarity Search in Streaming Time Series with Sliding Windows. Data & Knowledge Engineering, 2007, 63(2): 478-502 [5] Zhang Haiqin, Cai Qingsheng. Time Series Similar Pattern Matching Based on Wavelet Transform. Chinese Journal of Computers, 2003, 26(3): 1-5 (in Chinese) (张海勤,蔡庆生.基于小波变换的时间序列相似模式匹配.计算机学报, 2003, 26(3): 1-5) [6] Zhang Min, Zhang Yanping, Cheng Jiaxing. Hierarchical Algorithm to Match Similar Time Series Pattern. Journal of Computer-Aided Design & Computer Graphics, 2005, 17(7): 1480-1485 (in Chinese) (张 旻,张燕平,程家兴.时间序列相似模式的分层匹配.计算机辅助设计与图形学学报, 2005, 17(7): 1480-1485) [7] Peng Z K, Tse P W, Chu F L. An Improved Hilbert-Huang Transform and Its Application in Vibration Signal Analysis. Journal of Sound and Vibration, 2005, 286(1/2): 187-205 [8] Keogh E, Chu S, Hart D, et al. An On-line Algorithm for Segmenting Time Series // Proc of the IEEE International Conference on Data Mining. San Jose, USA, 2001: 289-296 [9] Keogh E, Kasetty S. On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration // Proc of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Canada, 2002: 102-111 [10] Ordonez C, Omiecinski E. Efficient Disk-Based K-means Clustering for Relational Databases. IEEE Trans on Knowledge and Data Engineering, 2004, 16(8): 909-921 [11] Zhang Jianpei, Yang Yue, Yang Jing, et al. Algorithm for Initialization of K-means Clustering Center Based on Optimized-Division. Journal of System Simulation, 2009, 21(9): 2586-2590 (in Chinese) (张健沛,杨 悦,杨 静,等.基于最优划分的K-means初始聚类中心选取算法.系统仿真学报, 2009, 21(9): 2586-2590) [12] Zhang Yan, Sun Zhengxing, Li Wenhui. Texture Synthesis Based on Direction Empirical Mode Decomposition. Computers & Graphics, 2008, 32(2): 175-186 |
|
|
|