Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
  2017, Vol. 30 Issue (12): 1138-1148    DOI: 10.16451/j.cnki.issn1003-6059.201712010
Original Article
Semi-supervised Labeled Hierarchical Dirichlet Process Topic Model for Document Categorization
LI Yongzhong, ZHENG Tao
School of Economics and Management, Fuzhou University, Fuzhou 350116

Abstract  The Hierarchical Dirichlet Process (HDP) topic model can automatically learn the optimal structure of the topic set from data. However, the resulting set of topics may not satisfy semantic requirements, and in some labeled topic models the parameters are difficult to set. Therefore, based on known semantic labels and the degree of certainty of those labels, this paper proposes a semi-supervised labeled HDP topic model (SLHDP) and an accuracy evaluation index for random clustering. Higher weights are assigned according to the known semantic labels. Exploiting the property of the Dirichlet process that a finite space can be divided infinitely, the model is constructed via the Chinese restaurant process. Experimental results on several Chinese and English datasets show that the SLHDP model yields a more reasonable topic set for text classification on large-scale datasets.
Key words: Label; Semi-supervised; Hierarchical Dirichlet Process (HDP); Topic Model; Random Cluster
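As an illustration of the Chinese restaurant process mentioned in the abstract, the following is a minimal sketch of the standard single-level process only. It is not the SLHDP model: the label weighting and the hierarchical sharing of topics described in the paper are not reproduced here. The function name `crp_seating` and the parameters `alpha` and `seed` are hypothetical names chosen for this example.

```python
import random


def crp_seating(num_customers, alpha, seed=0):
    """Sample table assignments from a standard Chinese restaurant process.

    With i customers already seated, the next customer joins existing
    table k with probability n_k / (i + alpha) and opens a new table with
    probability alpha / (i + alpha), where n_k is the current size of
    table k. Tables play the role of clusters (topics).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    table_counts = []  # table_counts[k] = number of customers at table k
    assignments = []   # assignments[i] = table index chosen by customer i
    for _ in range(num_customers):
        # Unnormalised weights: one per existing table, then the new table.
        weights = table_counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_counts):
            table_counts.append(1)  # open a new table
        else:
            table_counts[k] += 1
        assignments.append(k)
    return assignments, table_counts


if __name__ == "__main__":
    assignments, counts = crp_seating(num_customers=200, alpha=2.0)
    print("number of tables:", len(counts))
    print("table sizes:", counts)
```

The number of tables is not fixed in advance and grows with the data, which is the property that lets Dirichlet-process models infer the number of topics rather than require it as a parameter.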
Received: 29 December 2016     
ZTFLH: TP 391.1  
About the authors: LI Yongzhong, born in 1963, master's degree, associate professor. His research interests include information management and electronic government.
ZHENG Tao (corresponding author), born in 1992, master's student. His research interests include machine learning.
Cite this article:   
LI Yongzhong, ZHENG Tao. Semi-supervised Labeled Hierarchical Dirichlet Process Topic Model for Document Categorization[J]. Pattern Recognition and Artificial Intelligence, 2017, 30(12): 1138-1148.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.201712010      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2017/V30/I12/1138
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence