Pattern Recognition and Artificial Intelligence
2009, Vol. 22, Issue 5: 780-786
Researches and Applications
Document Cluster Ensemble Algorithms Based on Matrix Spectral Analysis
XU Sen1, LU Zhi-Mao2, GU Guo-Chang1
1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001
2. College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001

Abstract  Cluster ensemble techniques effectively improve the robustness and stability of a single clustering algorithm. The critical problem in cluster ensemble is how to combine multiple clusterings into a superior final clustering. Spectral clustering is introduced to solve the document cluster ensemble problem, and a normalized Laplacian matrix-based spectral algorithm (NLMSA) is proposed. Through an algebraic transformation, NLMSA obtains the eigenvectors of the normalized Laplacian matrix by computing the eigenvalues and eigenvectors of a small matrix. The key idea of spectral clustering is then investigated further, and a hyperedge transition matrix-based spectral algorithm (HTMSA) is proposed. HTMSA derives the low-dimensional embeddings of documents from those of the hyperedges and then applies the K-means algorithm to the document embeddings. Experimental results on TREC and Reuters document sets demonstrate the effectiveness of the proposed algorithms: both NLMSA and HTMSA outperform other cluster ensemble techniques based on graph partitioning. NLMSA obtains better results than HTMSA, while the computational cost of HTMSA is much lower than that of NLMSA.
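The "small matrix" idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' NLMSA implementation; it assumes the common hypergraph formulation of cluster ensembles, in which every cluster of every base clustering is a hyperedge, H is the n x m document-by-hyperedge incidence matrix, S = H H^T is the document similarity, and D holds the row sums of S. Under these assumptions, the eigenvectors of the n x n matrix D^{-1/2} S D^{-1/2} (equivalently, of the normalized Laplacian I - D^{-1/2} S D^{-1/2}) can be recovered from the m x m matrix B^T B with B = D^{-1/2} H. The function names and the toy labelings below are hypothetical and chosen only for illustration.

```python
import numpy as np

def incidence_matrix(labelings):
    """Stack one-hot cluster indicators of all base clusterings into an
    n x m incidence matrix (one column per cluster, i.e. per hyperedge)."""
    blocks = []
    for labels in labelings:
        labels = np.asarray(labels)
        ids = np.unique(labels)
        blocks.append((labels[:, None] == ids[None, :]).astype(float))
    return np.hstack(blocks)

def spectral_embedding_small(H, k):
    """Return the top-k eigenpairs of D^{-1/2} H H^T D^{-1/2} using only an
    m x m eigenproblem (assumes the k largest eigenvalues are positive)."""
    degrees = H @ H.sum(axis=0)           # row sums of S = H H^T, without forming S
    B = H / np.sqrt(degrees)[:, None]     # B = D^{-1/2} H, shape n x m
    small = B.T @ B                       # m x m symmetric matrix
    eigvals, eigvecs = np.linalg.eigh(small)
    top = np.argsort(eigvals)[::-1][:k]   # indices of the k largest eigenvalues
    lam = eigvals[top]
    # If (B^T B) v = lam * v, then B v / sqrt(lam) is a unit eigenvector of B B^T.
    U = (B @ eigvecs[:, top]) / np.sqrt(lam)
    return lam, U

# Toy ensemble: three base clusterings of six documents (illustrative data only).
labelings = [[0, 0, 0, 1, 1, 1],
             [0, 0, 1, 1, 2, 2],
             [1, 1, 1, 0, 0, 0]]
H = incidence_matrix(labelings)
lam, U = spectral_embedding_small(H, k=2)
```

The rows of U are low-dimensional embeddings of the documents; in a full pipeline they would typically be row-normalized and passed to K-means. The saving comes from m (the total number of base clusters) usually being far smaller than n (the number of documents).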
Key words: Clustering Analysis, Cluster Ensemble, Spectral Clustering, Document Clustering, Low Rank Approximation of Matrix
Received: 04 September 2008     
ZTFLH: TP391  
Cite this article:   
XU Sen, LU Zhi-Mao, GU Guo-Chang. Document Cluster Ensemble Algorithms Based on Matrix Spectral Analysis[J]. Pattern Recognition and Artificial Intelligence, 2009, 22(5): 780-786.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2009/V22/I5/780