Pattern Recognition and Artificial Intelligence (模式识别与人工智能)
2012, Vol. 25, Issue 6: 996-1001
Original Article
A Text Clustering Method Based on Speech to Text and Improved Center Selection
SHI Kan-Sheng, LIU Hai-Tao, SONG Wen-Tao
School of Electronic Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai 200040

Abstract  The traditional k-means algorithm is sensitive to the choice of initial centers and tends to converge to a local optimum. A text clustering method based on speech to text and improved center selection (STICS) is proposed. STICS improves on k-means in three respects, addressing the speech to text, the selection of centers, and the treatment of outliers together. First, a weighted vector space model (VSM) represents each text according to the speech to text. Second, the sample average similarity is computed for every sample, and the sample with the largest sample average similarity is selected as the first center. Third, STICS eliminates the negative influence of isolated points, or outliers. Both theoretical analysis and experimental results indicate that the proposed algorithm produces better clustering results.
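The first-center rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which similarity measure is used, so cosine similarity over VSM vectors is assumed here, and the function names `cosine_similarity`, `sample_average_similarity`, and `select_first_center` are hypothetical. The part-of-speech weighting and the outlier handling are not shown, since the abstract gives no details for them.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an empty document as similar to nothing
    return dot / (norm_a * norm_b)


def sample_average_similarity(vectors):
    """For each sample, the mean similarity to every other sample."""
    n = len(vectors)
    if n < 2:
        raise ValueError("need at least two samples")
    averages = []
    for i, v in enumerate(vectors):
        total = sum(
            cosine_similarity(v, w) for j, w in enumerate(vectors) if j != i
        )
        averages.append(total / (n - 1))
    return averages


def select_first_center(vectors):
    """Index of the sample with the largest sample average similarity."""
    averages = sample_average_similarity(vectors)
    return max(range(len(vectors)), key=averages.__getitem__)


# Toy term-weight vectors standing in for weighted VSM documents.
docs = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
averages = sample_average_similarity(docs)
first_center = select_first_center(docs)
```

In this toy data the second document shares terms with two others, so it has the largest average similarity and becomes the first center, while the fourth document shares no terms with the rest and has an average similarity of zero.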
Key words: Text Clustering, k-means, Speech to Text, Sample Average Similarity, Outlier
Received: 25 August 2011     
ZTFLH: TP3  
Cite this article:   
SHI Kan-Sheng, LIU Hai-Tao, SONG Wen-Tao. A Text Clustering Method Based on Speech to Text and Improved Center Selection[J]. Pattern Recognition and Artificial Intelligence, 2012, 25(6): 996-1001.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2012/V25/I6/996
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence