Included Angle Distance of Time Series and Similarity Search
ZHANG Peng1, LI Xue-Ren3, ZHANG Jian-Ye2,3, ZHANG Zong-Lin1
1. Engineering Institute, Air Force Engineering University, Xi'an 710038
2. College of Automation, Northwestern Polytechnical University, Xi'an 710072
3. Science Research Department, Air Force Engineering University, Xi'an 710051
Abstract: A method for the approximate representation and similarity measurement of time series is proposed. Based on adaptive piecewise linear representation, a time series is approximated by the sequence of included angles between each pair of neighboring line segments. The basic concepts and properties of the included angle distance are presented and proved. The included angle distance overcomes problems that arise when point distance is used as the similarity measure, such as poor robustness and ambiguous concepts. The proposed method is also invariant to translation and rotation. Experimental results on synthetic data and stock data show that the proposed method is effective.
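The abstract does not give the formal definition of the included angle distance, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the piecewise linear representation is already available as a list of breakpoints, uses the signed turning angle at each interior breakpoint as a stand-in for the included angle, and compares two equal-length angle sequences by summed absolute difference. The function names `included_angles` and `angle_distance` are hypothetical and do not come from the paper.

```python
import math

def included_angles(points):
    """Signed turning angle (radians) at each interior breakpoint.

    `points` is a list of (t, value) breakpoints from some piecewise
    linear representation; how they are chosen is outside this sketch.
    """
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        ax, ay = x1 - x0, y1 - y0  # direction of the incoming segment
        bx, by = x2 - x1, y2 - y1  # direction of the outgoing segment
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        angles.append(math.atan2(cross, dot))
    return angles

def angle_distance(p, q):
    """Assumed distance: sum of absolute differences between angle sequences."""
    a, b = included_angles(p), included_angles(q)
    if len(a) != len(b):
        raise ValueError("both series need the same number of segments")
    return sum(abs(u - v) for u, v in zip(a, b))

# Usage: shifting a series leaves its angle sequence unchanged.
base = [(0, 0), (1, 2), (2, 1), (3, 3)]
shifted = [(x + 5, y - 7) for x, y in base]
print(angle_distance(base, shifted))
```

Because the angle at a breakpoint depends only on the directions of the two adjacent segments relative to each other, translating or rotating the whole polyline leaves the sequence unchanged, which is consistent with the invariance claimed in the abstract.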
ZHANG Peng, LI Xue-Ren, ZHANG Jian-Ye, ZHANG Zong-Lin. Included Angle Distance of Time Series and Similarity Search [J]. Pattern Recognition and Artificial Intelligence, 2008, 21(6): 763-767.