Abstract: A feature selection algorithm based on association rules is presented, and the impact of the support and confidence thresholds on the presented method is studied. The experimental results show that the presented method produces smaller feature subsets and higher classification accuracy than other methods. Furthermore, the results indicate that high support and confidence levels do not guarantee high classification accuracy or a small feature subset, and that a sufficient number of rules is a precondition for efficient feature selection based on association rules.
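To make the idea behind the abstract concrete, the sketch below shows one common way such a method can work: mine association rules of the form {feature=value} => {class} subject to minimum support and confidence, then keep every feature that participates in at least one strong rule. This is a minimal illustration under assumed conventions (discretised features, single-item antecedents); the function names, data layout, and thresholds are hypothetical and not the authors' exact algorithm.

```python
from collections import Counter

def select_features(rows, labels, min_support=0.1, min_confidence=0.7):
    """Keep features that appear in at least one strong
    {feature=value} => {class} rule (illustrative sketch).

    rows: list of dicts mapping feature name -> discrete value.
    labels: class labels aligned with rows.
    min_support / min_confidence: the usual rule-mining thresholds.
    """
    n = len(rows)
    item_count = Counter()  # occurrences of (feature, value)
    pair_count = Counter()  # occurrences of (feature, value, class)
    for row, y in zip(rows, labels):
        for f, v in row.items():
            item_count[(f, v)] += 1
            pair_count[(f, v, y)] += 1

    selected = set()
    for (f, v, y), c in pair_count.items():
        support = c / n                       # fraction of rows matching the rule
        confidence = c / item_count[(f, v)]   # rule reliability given the antecedent
        if support >= min_support and confidence >= min_confidence:
            selected.add(f)  # feature takes part in a strong rule
    return selected

# Usage on a tiny hypothetical discretised dataset.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain",  "windy": "no"},
    {"outlook": "rain",  "windy": "yes"},
]
labels = ["play", "play", "play", "stay"]
print(select_features(rows, labels, min_support=0.25, min_confidence=0.9))
```

Raising min_support and min_confidence prunes rules, which is why, as the abstract notes, overly high thresholds can leave too few rules to select features effectively.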