Multi-label Feature Selection Based on Information Granulation
WANG Chenxi1,2, LIN Yaojin1,2, TANG Li1, FU Wei1, LIN Peirong1
1. School of Computer Science and Engineering, Minnan Normal University, Zhangzhou 363000
2. Key Laboratory of Data Science and Intelligence Application, Zhangzhou 363000
Feature selection chooses a subset of features from the original feature space that yields similar or better classification performance than the full feature set. This paper proposes a multi-label feature selection algorithm based on information granulation. The label weight and the sample average margin are first fused; the improved neighborhood information entropy is then applied to multi-label feature selection. Experiments on six datasets with five evaluation metrics show that the proposed algorithm is effective.
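The abstract does not give the formulas, so the following is only a rough sketch of how the ingredients it names might fit together, not the authors' algorithm. It assumes a nearest-miss minus nearest-hit margin per label, a weighted average of those margins as each sample's neighborhood radius, and the usual neighborhood entropy -(1/n) Σ log(|δ(x_i)|/n). The function names, the uniform default label weights, and the clipping of negative radii to zero are all hypothetical choices made for illustration.

```python
import numpy as np

def pairwise_dist(X):
    """Euclidean distance matrix between all rows of X."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def sample_margins(dist, y):
    """Per-sample margin for one label: nearest-miss minus nearest-hit distance.

    Assumption: samples with no hit or no miss get margin 0.
    """
    y = np.asarray(y)
    same = y[:, None] == y[None, :]
    other = ~same
    np.fill_diagonal(same, False)  # a sample is not its own nearest hit
    hit = np.where(same, dist, np.inf).min(axis=1)
    miss = np.where(other, dist, np.inf).min(axis=1)
    with np.errstate(invalid="ignore"):
        margin = miss - hit
    return np.where(np.isfinite(margin), margin, 0.0)

def weighted_radius(X, Y, weights=None):
    """Fuse label weights with per-label margins into one radius per sample.

    Assumption: the fusion is a weighted average over labels, clipped at 0.
    """
    Y = np.asarray(Y)
    dist = pairwise_dist(X)
    q = Y.shape[1]
    if weights is None:
        w = np.full(q, 1.0 / q)  # hypothetical default: uniform label weights
    else:
        w = np.asarray(weights, dtype=float)
    margins = np.stack([sample_margins(dist, Y[:, j]) for j in range(q)], axis=1)
    return np.clip(margins @ w, 0.0, None)

def neighborhood_entropy(X, delta):
    """Neighborhood entropy; delta is a scalar or one radius per sample."""
    dist = pairwise_dist(X)
    n = dist.shape[0]
    delta = np.broadcast_to(np.asarray(delta, dtype=float), (n,))
    sizes = (dist <= delta[:, None]).sum(axis=1)  # each sample counts itself
    return float(-np.mean(np.log(sizes / n)))

# Toy data: one feature, two labels that split the samples the same way.
X = np.array([[0.0], [1.0], [10.0], [11.0]])
Y = np.array([[0, 1], [0, 1], [1, 0], [1, 0]])
radius = weighted_radius(X, Y)
entropy = neighborhood_entropy(X, radius)
```

Under these assumptions, a feature subset whose neighborhoods stay small and label-consistent would score differently from one whose neighborhoods mix labels, which is the kind of signal a greedy feature ranking could use.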
王晨曦, 林耀进, 唐莉, 傅为, 林培榕. 基于信息粒化的多标记特征选择算法[J]. 模式识别与人工智能, 2018, 31(2): 123-131.
WANG Chenxi, LIN Yaojin, TANG Li, FU Wei, LIN Peirong. Multi-label Feature Selection Based on Information Granulation. Pattern Recognition and Artificial Intelligence, 2018, 31(2): 123-131.