Comparison of Constructions for Hierarchical Structure Based on Confusion Matrix
XIONG YunBo, LI RongLu, HU YunFa
Department of Computing and Information Technology, Fudan University, Shanghai 200433
Abstract Hierarchical clustering and confusion classification are each used to construct a hierarchical structure of document types from a confusion matrix. Experiments with hierarchical classification show that confusion classification performs better than hierarchical clustering, and that confusion classification improves on the precision and recall of a flat document classifier.
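The hierarchical-clustering construction mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or linkage the authors use, so the symmetrized confusion rate and average linkage below are assumptions, and the function names, class labels, and matrix values are hypothetical and chosen only for illustration.

```python
def confusion_similarity(cm):
    """Assumed similarity: share of documents of classes i and j confused with each other."""
    n = len(cm)
    totals = [sum(row) for row in cm]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                denom = (totals[i] + totals[j]) or 1  # guard against empty classes
                sim[i][j] = (cm[i][j] + cm[j][i]) / denom
    return sim


def build_hierarchy(cm, labels):
    """Average-linkage agglomerative clustering; returns the hierarchy as nested tuples."""
    sim = confusion_similarity(cm)
    # Each cluster is (subtree, list of member class indices).
    clusters = [(labels[i], [i]) for i in range(len(labels))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ma, mb = clusters[a][1], clusters[b][1]
                score = sum(sim[i][j] for i in ma for j in mb) / (len(ma) * len(mb))
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][0]


# Hypothetical confusion matrix: rows are true classes, columns are predicted classes.
labels = ["sports", "football", "finance", "stocks"]
cm = [
    [80, 15, 3, 2],
    [18, 78, 2, 2],
    [2, 1, 82, 15],
    [3, 2, 20, 75],
]
print(build_hierarchy(cm, labels))
```

Classes that a flat classifier often confuses end up under the same parent, so a hierarchical classifier can first choose the group and then separate the confusable classes within it.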
Received: 26 December 2005