Multi-label Feature Selection Algorithm Based on Local Subspace
LIU Jinghua1, LIN Menglei1, WANG Chenxi2, LIN Yaojin1
1. School of Computer Science and Engineering, Minnan Normal University, Zhangzhou 363000; 2. Department of Computer Engineering, Zhangzhou Institute of Technology, Zhangzhou 363000
Abstract Existing multi-label feature selection algorithms usually select the features most strongly relevant to the whole label set according to some criterion. However, this strategy may not be optimal, because some features that are only weakly related to the whole label set can still be key features for a few individual labels. Based on this observation, a multi-label feature selection algorithm based on local subspace is proposed. Firstly, the mutual information between each feature and the label set is used to measure the importance of the feature, and the original features are ranked in descending order of importance to obtain a new feature space. Then, the new feature space is partitioned into several subspaces, and less redundant features are selected in each subspace according to a preset sampling ratio. Finally, the feature subsets selected in the different subspaces are merged to obtain the final feature subset. Experiments are conducted on six datasets with four evaluation criteria, and the results show that the proposed algorithm outperforms state-of-the-art multi-label feature selection algorithms.
Received: 16 January 2015
Fund: Supported by National Natural Science Foundation of China (No.61303131, 61379021), Natural Science Foundation of Fujian Province (No.2013J01028), and S&T Program of the Department of Education of Fujian Province (No.JA14192)
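The selection procedure described in the abstract (relevance ranking by mutual information, partition of the ranked feature space into subspaces, and per-subspace selection under a sampling ratio) can be sketched in Python as follows. This is a minimal illustration under assumptions, not the paper's implementation: the function name local_subspace_select, the parameters num_subspaces and sample_ratio, the use of scikit-learn's mutual_info_classif (summed over labels) as the relevance measure, and random sampling as a stand-in for the paper's redundancy-aware selection within each subspace are all choices made here for illustration.

import numpy as np
from sklearn.feature_selection import mutual_info_classif


def local_subspace_select(X, Y, num_subspaces=5, sample_ratio=0.4, seed=0):
    """X: (n_samples, n_features) feature matrix; Y: (n_samples, n_labels) binary label matrix."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]

    # Step 1: score each feature by its mutual information with the label set
    # (approximated here by summing MI over the individual labels) and rank
    # the features from most to least relevant.
    relevance = np.zeros(n_features)
    for j in range(Y.shape[1]):
        relevance += mutual_info_classif(X, Y[:, j], random_state=seed)
    ranked = np.argsort(relevance)[::-1]

    # Step 2: partition the ranked feature space into contiguous subspaces and
    # draw a fixed fraction of features from each one, so that features weakly
    # related to the whole label set but important for a few labels can still
    # be kept. Random sampling is a placeholder for the paper's low-redundancy
    # selection within each subspace.
    selected = []
    for block in np.array_split(ranked, num_subspaces):
        k = max(1, int(round(sample_ratio * len(block))))
        selected.extend(rng.choice(block, size=k, replace=False))

    # Step 3: merge the per-subspace selections into the final feature subset.
    return np.sort(np.asarray(selected))

A call such as local_subspace_select(X, Y, num_subspaces=5, sample_ratio=0.4) returns the indices of the selected features, which can then be passed to any multi-label classifier (for example, ML-KNN) for evaluation.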