Feature Selection Algorithm Based on Neighborhood Valued Tolerance Relation Rough Set Model
YAO Sheng1,2, XU Feng2, ZHAO Peng1,2, WANG Jie2, CHEN Ju2
1. Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Anhui University, Hefei 230039; 2. College of Computer Science and Technology, Anhui University, Hefei 230601
Abstract: Most existing feature selection methods for incomplete numerical information systems are based on the tolerance relation, which characterizes the similarity between data objects too loosely. To address this, a rough set model based on the neighborhood valued tolerance relation is proposed in this paper. The neighborhood valued tolerance condition entropy is defined on the basis of this model, and its related properties are analyzed. Finally, a feature selection algorithm is constructed according to the monotonicity of the neighborhood valued tolerance condition entropy. Experimental results show that the proposed algorithm outperforms existing algorithms in terms of feature selection results, running time, and classification accuracy.
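The abstract does not give the definitions themselves, so the following Python sketch only illustrates the general shape of a greedy selection driven by a monotone uncertainty measure. Everything in it is an assumption made for illustration: the function names (`tolerance_matrix`, `disagreement`, `greedy_select`), the radius `delta`, the plain neighborhood tolerance relation (missing values match anything, known values match within `delta`), and the stand-in measure (the fraction of related object pairs whose decision labels differ). It is not the paper's neighborhood valued tolerance relation or its condition entropy.

```python
import numpy as np

def tolerance_matrix(X, features, delta):
    """Boolean n x n matrix: True where two objects are related on every
    feature in `features` (assumed relation: a missing value matches
    anything, known values match when they differ by at most delta)."""
    n = X.shape[0]
    related = np.ones((n, n), dtype=bool)
    for a in features:
        col = X[:, a]
        missing = np.isnan(col)
        with np.errstate(invalid="ignore"):
            close = np.abs(col[:, None] - col[None, :]) <= delta
        related &= close | missing[:, None] | missing[None, :]
    return related

def disagreement(X, y, features, delta):
    """Stand-in uncertainty measure: fraction of ordered object pairs that
    are related on `features` but carry different decision labels.
    Adding features can only shrink the relation, so the measure is
    monotonically non-increasing in the feature set."""
    related = tolerance_matrix(X, features, delta)
    differs = y[:, None] != y[None, :]
    return np.count_nonzero(related & differs) / (X.shape[0] ** 2)

def greedy_select(X, y, delta=0.1, tol=1e-12):
    """Forward selection: add the feature that lowers the measure most,
    stopping once the subset matches the measure of the full feature set."""
    all_features = list(range(X.shape[1]))
    target = disagreement(X, y, all_features, delta)
    selected, remaining = [], set(all_features)
    while remaining and disagreement(X, y, selected, delta) > target + tol:
        best = min(sorted(remaining),
                   key=lambda a: disagreement(X, y, selected + [a], delta))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy incomplete numerical data (np.nan marks a missing value).
X = np.array([[0.10, 0.50, 0.90],
              [0.15, np.nan, 0.10],
              [0.90, 0.50, 0.85],
              [0.85, 0.52, np.nan]])
y = np.array([0, 0, 1, 1])
print(greedy_select(X, y, delta=0.1))
```

Monotonicity is what makes the stopping rule safe: because the measure never increases when a feature is added, the full feature set attains the minimum, and the loop can stop as soon as a subset reaches that value.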