Abstract: Attribute selection is an effective data preprocessing method. It preserves the temporal relations among the important attributes of a multivariate time series, as well as their physical meanings. To address the lack of class labels in real-world data, an unsupervised attribute selection method is proposed and its time complexity is analyzed. First, a method for computing the fractal dimension of a multivariate time series is proposed; it does not require reconstructing the phase space. The proposed method treats the fractal dimension as the intrinsic dimension of the data. Accordingly, the changes in the number of attributes and in the fractal dimension of attribute subsets are used as the evaluation criterion for attribute subsets. To cope with the combinatorial explosion in the high-dimensional search space, the discrete particle swarm optimization algorithm is improved. Finally, numerical simulations on multivariate time series from a typical chaotic dynamical system and on five datasets from the UCI database confirm the effectiveness of the proposed algorithm. The experimental results also show that the proposed algorithm finds better attribute subsets in less time and achieves better overall performance.
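The evaluation criterion summarized above can be illustrated with a minimal sketch. It assumes the fractal dimension is estimated as a correlation dimension computed from pairwise distances between the raw multivariate samples, which is one common estimator and not necessarily the one the paper proposes. The function names, the percentile range used for the radii, and the 0/1 mask representation of an attribute subset are all hypothetical choices made for illustration.

```python
import math
import random
from bisect import bisect_right
from itertools import combinations


def correlation_dimension(points, n_radii=10):
    """Estimate the correlation dimension of `points` (equal-length tuples)
    as the least-squares slope of log C(r) against log r."""
    # Distances between all distinct pairs of samples; zero distances are dropped.
    dists = sorted(d for d in (math.dist(p, q) for p, q in combinations(points, 2)) if d > 0)
    # Radii spaced geometrically between the 1st and 20th percentile of the
    # distances (an assumed range, chosen to stay in the small-radius region).
    lo, hi = dists[len(dists) // 100], dists[len(dists) // 5]
    radii = [lo * (hi / lo) ** (k / (n_radii - 1)) for k in range(n_radii)]
    xs = [math.log(r) for r in radii]
    # C(r): fraction of pairs whose distance is at most r.
    ys = [math.log(bisect_right(dists, r) / len(dists)) for r in radii]
    mx, my = sum(xs) / n_radii, sum(ys) / n_radii
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)


def project(points, mask):
    """Keep only the attributes whose mask entry is truthy."""
    return [tuple(v for v, keep in zip(p, mask) if keep) for p in points]


def dimension_change(points, mask):
    """Absolute change in estimated dimension when restricting to a subset."""
    return abs(correlation_dimension(points) - correlation_dimension(project(points, mask)))


# Synthetic example: the third attribute is redundant (x + y).
rng = random.Random(0)
data = [(x, y, x + y) for x, y in ((rng.random(), rng.random()) for _ in range(200))]
print("drop redundant attribute:", dimension_change(data, (1, 1, 0)))
print("drop informative attribute too:", dimension_change(data, (1, 0, 0)))
```

In this sketch a subset is preferable when it has few attributes and a small dimension change; how the two quantities are combined into a single fitness value for the particle swarm search is not specified here and is left open.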