Feature Selection Based on Adaptive Whale Optimization Algorithm and Fault-Tolerance Neighborhood Rough Sets
SUN Lin 1,2, HUANG Jinxu 1, XU Jiucheng 1, MA Yuanyuan 1
1. College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007; 2. Henan Engineering Laboratory of Smart Business and Internet of Things Technology, Henan Normal University, Xinxiang 453007
Abstract The traditional whale optimization algorithm (WOA) cannot handle continuous data effectively, and neighborhood rough sets (NRS) tolerate noisy data poorly. To address these issues, a feature selection algorithm based on an adaptive WOA and fault-tolerance NRS is presented. First, a piecewise dynamic inertia weight based on the iteration cycle is proposed to prevent the WOA from falling into a local optimum prematurely. The shrinking encircling and spiral predation behaviors of the WOA are improved, and an adaptive WOA is designed. Second, the ratio of samples with the same decision feature in a neighborhood is introduced to compensate for the NRS model's lack of fault tolerance to noisy data, and the upper and lower approximations, approximation precision, approximation roughness, fault-tolerance dependence, and approximation conditional entropy of the fault-tolerance neighborhood are defined. Finally, a fitness function is constructed based on the fault-tolerance NRS, and the adaptive WOA iteratively searches for the optimal feature subset. The Fisher score is employed to preliminarily reduce the dimensionality of high-dimensional datasets, which effectively lowers the time complexity of the proposed algorithm. The proposed algorithm is tested on 8 low-dimensional UCI datasets and 6 high-dimensional gene datasets. Experimental results demonstrate that it selects fewer features while maintaining high classification accuracy.
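The Fisher score pre-filtering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the textbook definition of the Fisher score (between-class scatter of a feature divided by its within-class scatter); the abstract does not state which variant the paper uses, and the function names `fisher_scores` and `top_k_features` are hypothetical names chosen for this illustration only.

```python
from collections import defaultdict


def fisher_scores(X, y):
    """Return one Fisher score per feature (column) of X.

    X: list of samples, each a list of numeric feature values.
    y: list of class labels, one per sample.
    Assumption: textbook Fisher score, not necessarily the paper's variant.
    """
    n_samples = len(X)
    n_features = len(X[0])

    # Group sample indices by class label.
    groups = defaultdict(list)
    for i, label in enumerate(y):
        groups[label].append(i)

    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = sum(column) / n_samples
        between = 0.0  # class-size-weighted spread of the class means
        within = 0.0   # class-size-weighted variance inside each class
        for indices in groups.values():
            values = [column[i] for i in indices]
            n_k = len(values)
            mean_k = sum(values) / n_k
            var_k = sum((v - mean_k) ** 2 for v in values) / n_k
            between += n_k * (mean_k - overall_mean) ** 2
            within += n_k * var_k
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k_features(X, y, k):
    """Indices of the k features with the highest Fisher scores."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]
```

In a pipeline of the kind the abstract describes, a filter like `top_k_features` would be applied once to a high-dimensional gene dataset, and the subsequent search would run only over the retained columns, so each fitness evaluation touches far fewer features.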
Received: 22 September 2021
Fund: Supported by the National Natural Science Foundation of China (No. 62076089, 61976082, 62002103) and the Key Scientific and Technology Program of Henan Province (No. 212102210136)
Corresponding Author:
SUN Lin, Ph.D., associate professor. His research interests include granular computing, big data mining and bioinformatics.
About the authors: HUANG Jinxu, master's student. His research interests include data mining. XU Jiucheng, Ph.D., professor. His research interests include granular computing, big data mining and intelligent information processing. MA Yuanyuan, Ph.D., associate professor. Her research interests include granular computing and intelligent information processing.