Abstract: Theoretical and experimental results indicate that, among ensemble classifiers with the same training error, the one with the better margin distribution on the training examples has better generalization performance. Motivated by this, the concept of example margins is introduced into ensemble pruning and used to guide the design of pruning methods. Based on the margins, a new metric called the margin-based metric (MBM) is designed to evaluate the importance of a classifier with respect to an ensemble and an example set, and a greedy ensemble pruning method, MBM-based ensemble selection, is then proposed to reduce the ensemble size and improve its accuracy. Experimental results on 30 UCI datasets show that the ensembles selected by the proposed method outperform those produced by other state-of-the-art greedy ensemble pruning methods.
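The abstract does not give the exact form of MBM, so the following is only a minimal sketch of margin-based greedy forward selection under stated assumptions: the voting margin of an example is taken as the fraction of votes for its true class minus the largest fraction for any other class, and the selection objective is the mean margin of the growing sub-ensemble, a plausible stand-in for the paper's MBM rather than its actual definition. All identifiers (example_margins, greedy_margin_pruning) are hypothetical.

    import numpy as np

    def example_margins(vote_counts, y, size):
        """Voting margin of each example: (votes for the true class minus
        the largest vote count among the other classes) / ensemble size."""
        n = len(y)
        true_votes = vote_counts[np.arange(n), y]
        rivals = vote_counts.copy()
        rivals[np.arange(n), y] = -1          # exclude the true class
        return (true_votes - rivals.max(axis=1)) / size

    def greedy_margin_pruning(preds, y, n_classes, target_size):
        """Forward greedy selection: at each step add the classifier that
        maximizes the mean voting margin of the growing sub-ensemble.
        preds has shape (T, n): predictions of T base classifiers on n examples."""
        T, n = preds.shape
        target_size = min(target_size, T)     # cannot select more than T members
        selected, counts = [], np.zeros((n, n_classes))
        while len(selected) < target_size:
            best, best_score = None, -np.inf
            for t in range(T):
                if t in selected:
                    continue
                trial = counts.copy()
                trial[np.arange(n), preds[t]] += 1
                score = example_margins(trial, y, len(selected) + 1).mean()
                if score > best_score:
                    best, best_score = t, score
            counts[np.arange(n), preds[best]] += 1
            selected.append(best)
        return selected

    # Toy usage: 25 noisy base classifiers on a 3-class problem, prune to 7.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 3, 200)
    preds = np.where(rng.random((25, 200)) < 0.7, y, rng.integers(0, 3, (25, 200)))
    print(greedy_margin_pruning(preds, y, n_classes=3, target_size=7))

One pass of this kind of forward selection evaluates O(T^2) candidate sub-ensembles, each at O(n) cost; greedy ordering methods accept this overhead in exchange for a smaller and often more accurate ensemble.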