改进的基于排序熵的有序决策树算法*

陈建凯,王熙照,高相辉

PDF(444 KB)
模式识别与人工智能 ›› 2014, Vol. 27 ›› Issue (2) : 134-140.
论文与报告

改进的基于排序熵的有序决策树算法*

作者信息 +

Improved Ordinal Decisions Trees Algorithms Based on Rank Entropy

Author information +
History +

摘要

由于基于排序熵的有序决策树在扩展属性选取时,需计算每个条件属性的每个割点处的排序互信息,并通过对比这些排序互信息的大小来确定最大值(最大值对应的属性为扩展属性),计算复杂度较高。针对此问题,文中将割点分为平衡割点和非平衡割点两部分,建立一个数学模型,从理论上证明排序互信息最大值不会在平衡割点处达到,而只能在非平衡割点处达到。这说明在计算排序互信息时只需遍历非平衡割点,而无需再计算平衡割点处的值,从而使决策树构建的计算效率得到较大程度提高。数值实验验证此结果。

Abstract

When the expanded attributes are selected for decision tree learning based on rank entropy, computing the rank mutual information of every single cut for each of the continuous-valued attributes is required to get the expanded attribute by comparing the values of rank mutual information. Therefore, the computational complexity is high. Aiming at this problem, cut-points are divided into stable and unstable cut-points and a mathematical model is established in this paper. The proposed model theoretically proves that the rank mutual information function achieves its maximum not at stable cut-points, but at unstable cut-points. The result means that in the algorithm only traversing the unstable cut-points is required instead of computing the values of the stable cut-points. Thus, the computational efficiency of building decision trees is greatly improved, which is confirmed by the numerical experimental results.

关键词

有序分类 / 有序决策树 / 非平衡割点

Key words

Ordinal Classification / Ordinal Decision Tree / Unstable Cut-Point

引用本文

导出引用
陈建凯 , 王熙照 , 高相辉. 改进的基于排序熵的有序决策树算法*. 模式识别与人工智能. 2014, 27(2): 134-140
CHEN Jian-Kai , WANG Xi-Zhao , GAO Xiang-Hui. Improved Ordinal Decisions Trees Algorithms Based on Rank Entropy. Pattern Recognition and Artificial Intelligence. 2014, 27(2): 134-140

参考文献

[1] Mitchell T M. Machine Learning. New York, USA: McGraw-Hill, 1997
[2] Wang X Z, Hong J R. Learning Algorithm of Decision Tree Generation for Interval-Valued Attributes. Journal of Software, 1998, 9(8): 637-640(in Chinese)
(王熙照,洪家荣.区间值属性决策树学习算法.软件学报,1998, 9(8): 637-640)
[3] Quinlan J R. Induction of Decision Tree. Machine Learning, 1986, 1(1): 81-106
[4] Wu X D, Kumar V, Quinlan J R, et al. Top 10 Algorithms in Data Mining. Knowledge and Information Systems, 2008, 14(1): 1-37
[5] Breiman L, Friedman J H, Olshen R A, et al. Classification and Regression Tree. Belmont, USA: Wadsworth International Group, 1984
[6] Wang X Z, Chen B, Qian G L, et al. On the Optimization of Fuzzy Decision Trees. Fuzzy Sets and Systems, 2000, 112(1): 117-125
[7] Wang X Z, Yeung D S. Tsang E C C. A Comparative Study on Heuristic Algorithms for Generating Fuzzy Decision Trees. IEEE Trans on Systems, Man and Cybernetics, 2001, 31(2): 215-226
[8] Wang X Z, Hong J R. On the Handling of Fuzziness for Continuous-Valued Attributes in Decision Tree Generation. Fuzzy Sets and Systems, 1998, 99(3): 283-290
[9] Tsang E C C, Wang X Z, Yeung D S. Improving Learning Accuracy of Fuzzy Decision Trees by Hybrid Neural Networks. IEEE Trans on Fuzzy Systems, 2000, 8(5): 601-614
[10] Shiu S C K, Sun C H, Wang X Z, et al. Maintaining Case-Based Reasoning Systems Using Fuzzy Decision Trees // Blanzieri E, Portinale L, eds. Lecture Notes in Artificial Intelligence. Berlin, Germany: Springer-Verlag. 2000, 1898: 285-296
[11] Wang X Z, Yang C X. Merging-Branches Impact on Decision Tree Induction. Chinese Journal of Computers, 2007, 30(8): 1251-1258 (in Chinese)
(王熙照,杨晨晓.分支合并对决策树归纳学习的影响.计算机学报, 2007, 30(8): 1251-1258)
[12] Yang C X, Wang X Z, Zhu R X. A Strategy of Merging Branches Based on Margin Enlargement of SVM in Decision Tree Induction // Proc of the IEEE International Conference on Systems, Man and Cybernetics. Taipei, China, 2006, I: 824-828
[13] Wang X Z, Zhao M H, Wang D H. Selection of Parameters in Building Fuzzy Decision Trees // Gedeon T D, Fung L C C, eds. Lecture Notes in Artificial Intelligence. Berlin, Germany: Springer-Verlag, 2003, 2903: 282-292
[14] Lee J W T, Wang X Z, Wang J F. Reduction of Attributes in Ordinal Decision Systems // Yeung D S, ed. Lecture Notes in Artificial Intelligence. Berlin, Germany: Springer-Verlag, 2006, 3930: 578-587
[15] Zopounidis C, Doumpos M. Multicriteria Classification and Sorting Methods: A Literature Review. European Journal of Operational Research, 2002, 138(2): 229-246
[16] Ben-David A, Sterling L, Pao Y H. Learning and Classification of Monotonic Ordinal Concepts. Computational Intelligence, 1989, 5(1): 45-49
[17] Potharst R, Bioch J C. Decision Trees for Ordinal Classification. Intelligent Data Analysis, 2000, 4(2): 97-111
[18] Potharst R. Feelders A J. Classification Trees for Problems with Monotonicity Constrains. SIGKDD Explorations, 2002, 4(1): 1-10
[19] Cao-Van K, Baets B D. Growing Decision Trees in an Ordinal Se-tting. International Journal of Intelligent Systems, 2003, 18(7): 733-750
[20] Barile N, Feelders A J. Nonparametric Monotone Classification with MOCA // Proc of the 8th IEEE International Conference on Data Mining. Pisa, Italy, 2008: 731-736
[21] Kotlowski W, Slowinski R. Rule Learning with Monotonicity Constrains // Proc of the 26th Annual International Conference on Machine Learning. Montreal, Canada, 2009: 537-544
[22] Hu Q H, Guo M Z, Yu D R, et al. Information Entropy for Ordinal Classification. Science China: Information Sciences, 2010, 53(6): 1188-1200
[23] Hu Q H, Che X J, Zhang L, et al. Rank Entropy-Based Decision Trees for Monotonic Classification. IEEE Trans on Knowledge and Data Engineering, 2012, 24(11): 2052-2064

基金

国家自然科学基金(No.61170040)、河北省自然科学基金(No.F2013201110,F2013201220)资助项目
PDF(444 KB)

1075

Accesses

0

Citation

Detail

段落导航
相关文章

/