一种可伸缩的快速属性约简算法<sup>*</sup>

摘要
图/表
参考文献
相关文章 (15)

全文: PDF (316 KB) HTML (1 KB)
输出: BibTeX | EndNote (RIS)

摘要现有基于粗糙集的属性约简算法主要针对数据全部驻留在内存中的情况，因此算法不适合海量数据的约简计算、可伸缩性较差.本文借助SLIQ算法的思想并引入相应的一种数据预处理策略，由此提出一个快速的属性约简算法，其时间复杂性为O(｜U｜｜C｜).实验结果表明该算法具有良好的可伸缩性.

	服务

	把本文推荐给朋友
	加入我的书架
	加入引用管理器
	E-mail Alert
	RSS
	作者相关文章
	吴子特
	叶东毅

关键词 ：粗糙集, 属性约简, 海量数据, 可伸缩性

Abstract：The existing rough set based attribute reduction algorithms are mainly designed for the problem of the underlying data residing in the main memory. Therefore, the limitation of their application to attribute reduction computation of huge data results in a relatively poor scalability. Inspired by supervised learning in quest (SLIQ) algorithm, a specific data pre-processing strategy is introduced and a fast attribute reduction algorithm is proposed with time complexity O(｜U｜｜C｜). The experimental results show that the proposed algorithm is of good scalability.

Key words： Rough Set Attribute Reduction Huge Data Scalability

收稿日期: 2008-04-07

ZTFLH:

TP181

基金资助:国家自然科学基金(No.60805042)、福建省自然科学基金(No.A0710006)资助项目

作者简介: 吴子特，男，1983年生，硕士研究生，主要研究方向为粗糙集理论及其应用.E-mail: zt-292@163.com.叶东毅，男，1964年生，博士，教授，主要研究方向为粗糙集理论及其应用、计算智能.

引用本文:

吴子特，叶东毅. 一种可伸缩的快速属性约简算法^*[J]. 模式识别与人工智能, 2009, 22(2): 234-239. WU Zi-Te, YE Dong-Yi. A Fast Scalable Attribute Reduction Algorithm. , 2009, 22(2): 234-239.

链接本文:

http://manu46.magtech.com.cn/Jweb_prai/CN/ 或 http://manu46.magtech.com.cn/Jweb_prai/CN/Y2009/V22/I2/234

[1] Pawlak Z. Rough Set. International Journal of Computer and Information Sciences, 1982, 11(5): 341-356
[2] Wang Guoyin. Rough Set Theory and Knowledge Acquisition. Xi'an, China: Xi'an Jiaotong University Press, 2001 (in Chinese)
(王国胤.Rough集理论与知识获取.西安:西安交通大学出版社, 2001)
[3] Hu Xiaohua, Cercone N. Learning in Relational Database: A Rough Set Approach. International Journal of Computational Intelligence, 1995, 11(2): 323-338
[4] Ye Dongyi. An Improvement to Jelonek's Attribute Reduction Algorithm. Acta Electronica Sinica, 2000, 28(12): 81-82 (in Chinese)
(叶东毅.Jelonek属性约简算法的一个改进.电子学报, 2000, 28(12): 81-82)
[5] Liu Shaohui, Sheng Qiujian, Shi Zhongzhi. A New Method for Fast Computing Positive Region. Journal of Computer Research and Development, 2003, 40(5): 637-642 (in Chinese)
(刘少辉,盛秋戬,史忠植.一种新的快速计算正区域的方法.计算机研究与发展, 2003, 40(5): 637-642)
[6] Liu Shaohui, Sheng Qiujian, Wu Bin, et al. Research on Efficient Algorithms for Rough Set Methods. Chinese Journal of Computers, 2003, 26(5): 524-529 (in Chinese)
(刘少辉,盛秋戬,吴斌,等.Rough集高效算法的研究.计算机学报, 2003, 26(5): 524-529)
[7] Hu Feng, Wang Guoyin. Analysis of the Complexity of Quick Sort for Two Dimension Table. Chinese Journal of Computers, 2007, 30(6): 963-968 (in Chinese)
(胡峰,王国胤.二维表快速排序的复杂度分析.计算机学报, 2007, 30(6): 963-968)
[8] Xu Zhangyan, Liu Zuopeng, Yang Binru, et al. A Quick Attribute Reduction Algorithm with Complexity of max(O(｜C｜｜U｜),O(｜C｜²｜U/C｜)). Chinese Journal of Computers, 2006, 29(3): 391-399 (in Chinese)
(徐章艳,刘作鹏,杨炳儒,等.一个复杂度为公式的快速属性约简算法. 计算机学报, 2006, 29(3): 391-399)
[9] Wang Guoyin, Yu Hong, Yang Dachun. Decision Table Reduction Based on Conditional Information Entropy. Chinese Journal of Computers, 2002, 25(7): 759-766 (in Chinese)
(王国胤,于洪,杨大春.基于条件信息熵的决策表约简.计算机学报, 2002, 25(7): 759-766)
[10] Mehta M, Agrawal R, Rissanen J. SLIQ: A Fast Scalable Classifier for Data Mining // Proc of the 5th International Conference on Extending Database Technology. Avignon, France, 1996: 18-32
[11] Shafer J C. Agrawal R, Mehta M. SPRINT: A Scalable Parallel Classifier for Data Mining // Proc of the 22nd International Conference on Very Large Databases. San Mateo, USA, 1996: 544-555
[12] Gehrke J, Ramakrishnan R, Ganti V. Rainforest: A Framework for Fast Decision Tree Construction of Large Datasets. Data Mining and Knowledge Discovery, 2000, 4(2/3): 127-162
[13] Ye Dongyi, Chen Zhaojiong. A New Discernibility Matrix and the Computation of a Core. Acta Electronica Sinica, 2002, 30(7): 1086-1088 (in Chinese)
(叶东毅,陈昭炯.一个新的差别矩阵及其求核方法.电子学报, 2002, 30(7): 1086-1088)
[14] Xu Zhangyan, Yang Bingru, Song Wei. Quick Computing Core Algorithm Based on Discernibility Matrix. Computer Engineering and Applications, 2006, 42(6): 4-6 (in Chinese)
(徐章艳,杨炳儒,宋威.一个基于差别矩阵的快速求核算法.计算机工程与应用, 2006, 42(6): 4-6)