模式识别与人工智能
Pattern Recognition and Artificial Intelligence  2022, Vol. 35 Issue (4): 348-362    DOI: 10.16451/j.cnki.issn1003-6059.202204005
Researches and Applications
Multi-observation I-nice Clustering Algorithm Based on Candidate Centers Fusion
CHEN Hongjie1,2, HE Yulin1,2, HUANG Zhexue1,2, YIN Jianfei1,2
1. Big Data Institute, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060;
2. National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060

Abstract  

With the rapid growth of data scale and compositional complexity in real-world applications, accurately estimating the number of clusters and the cluster centers in complex, large-scale data is an important challenge for current clustering algorithms. Accurate estimates of the cluster number and cluster centers are crucial for some parametric clustering algorithms, for measuring dataset complexity, and for the simplified representation of datasets. In this paper, based on an in-depth analysis of I-nice, a multi-observation I-nice clustering algorithm based on candidate centers fusion (I-niceCF) is proposed. Within the original multi-observation projection divide-and-conquer framework, the Gaussian mixture model (GMM) is combined with a coarse-to-fine search strategy for the optimal mixture model to partition data subsets accurately. In addition, GMM component vectors of the candidate centers are constructed from the distances between the candidate centers and each observation point and from the optimal GMMs. A Minkowski distance pair is designed to measure the dissimilarity between candidate centers. Finally, the candidate centers are fused based on the mixture component vectors. Unlike existing clustering algorithms, I-niceCF jointly optimizes the data subset partitioning of the divide-and-conquer process and the candidate centers fusion. Consequently, accurate and efficient estimation for hundreds of clusters is achieved. A series of experiments on real and synthetic datasets shows that I-niceCF estimates the cluster number and cluster centers more accurately and attains higher clustering accuracy, and its stability under various data scenarios is verified.
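
The fusion idea in the abstract can be illustrated with a minimal sketch. This is not the I-niceCF procedure: the paper fuses candidate centers using GMM component vectors and a Minkowski distance pair, whose exact definitions are not given in the abstract. The sketch below assumes instead that each candidate center is a plain coordinate vector, that two candidates belong together when their Minkowski distance is at most a user-chosen threshold, and that a fused center is the mean of its group. The names `minkowski`, `fuse_centers`, `threshold`, and the sample data are all hypothetical.

```python
from typing import List, Sequence


def minkowski(a: Sequence[float], b: Sequence[float], p: float = 2.0) -> float:
    """Minkowski distance of order p between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def group_mean(group: List[Sequence[float]]) -> List[float]:
    """Coordinate-wise mean of the vectors in a group."""
    return [sum(col) / len(group) for col in zip(*group)]


def fuse_centers(
    candidates: List[Sequence[float]], threshold: float, p: float = 2.0
) -> List[List[float]]:
    """Greedy single-pass fusion (an assumed, simplified rule).

    Each candidate joins the first existing group whose current mean lies
    within `threshold` of it; otherwise it starts a new group. The fused
    centers are the group means.
    """
    groups: List[List[Sequence[float]]] = []
    for c in candidates:
        for g in groups:
            if minkowski(c, group_mean(g), p) <= threshold:
                g.append(c)
                break
        else:
            # No existing group is close enough: start a new one.
            groups.append([c])
    return [group_mean(g) for g in groups]


# Hypothetical candidate centers, as if reported from several observation points.
candidates = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.0, 5.2), (0.0, 0.2)]
fused = fuse_centers(candidates, threshold=1.0)
print(len(fused), fused)
```

With these sample values the five candidates collapse into two fused centers, so the number of fused centers doubles as an estimate of the cluster number. The result of a greedy pass depends on the order of the candidates and on the threshold, which is one reason a more principled dissimilarity measure, such as the one the paper designs, is preferable in practice.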

Key words: Unsupervised Learning; Observation Point; I-nice; Parameter-Free Clustering; Gaussian Mixture Model
Received: 25 October 2021     
ZTFLH: TP301  
Fund:

National Natural Science Foundation of China (No. 61972261), Basic Research Foundations of Shenzhen (No. JCYJ20210324093609026, JCYJ20200813091134001)

Corresponding Authors: HE Yulin, Ph.D., associate professor. His research interests include data mining, machine learning, big data processing and analysis, big data approximate computing, and theories and methods of multi-sample statistics.   
About authors: CHEN Hongjie, master student. His research interests include data mining, big data processing and analysis, and big data system computing technology. HUANG Zhexue, Ph.D., professor. His research interests include data mining, machine learning, big data processing and analysis, and big data system computing technology. YIN Jianfei, Ph.D., associate professor. His research interests include big data processing and analysis, machine learning, data mining and numerical optimization.
Cite this article:   
CHEN Hongjie, HE Yulin, HUANG Zhexue, et al. Multi-observation I-nice Clustering Algorithm Based on Candidate Centers Fusion[J]. Pattern Recognition and Artificial Intelligence, 2022, 35(4): 348-362.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.202204005      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2022/V35/I4/348
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No.350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax:0551-65591176 Email: bjb@iim.ac.cn