Research Progress of Deep Clustering Based on Unsupervised Representation Learning

doi:10.16451/j.cnki.issn1003-6059.202211005

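Abstract: In the era of big data, data are typically large in scale, high in dimension, and complex in structure. Deep clustering uses deep learning to combine representation learning with the clustering task, substantially improving clustering performance on large-scale, high-dimensional data. Few existing surveys summarize and analyze the development of deep clustering from the perspective of representation learning, and few analyze experimentally the differences between traditional and deep clustering algorithms or among different deep clustering algorithms. Therefore, starting from unsupervised representation learning, this paper first briefly reviews the clustering algorithms commonly used in deep clustering, then divides deep clustering algorithms into those based on generative models and those based on discriminative models, and analyzes the representation learning process of each deep model in the clustering task. Next, multiple classes of algorithms are compared experimentally, and their advantages and disadvantages are summarized to support algorithm selection for specific tasks. Finally, to promote the further development of deep clustering, its application scenarios are described and future trends are discussed.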

Authors: HOU Haiwei, DING Shifei, XU Xiao

Keywords: neural network, representation learning, deep clustering, unsupervised learning, loss function

Abstract: In the era of big data, data are usually large in scale, high in dimension and complex in structure. Deep clustering uses deep learning to combine representation learning with clustering tasks, which greatly improves clustering performance on large-scale, high-dimensional data. The development of deep clustering has rarely been summarized from the perspective of representation learning, and the differences between traditional and deep clustering algorithms, as well as those among deep clustering algorithms, have seldom been analyzed experimentally. Firstly, the clustering algorithms commonly used in deep clustering are summarized. Deep clustering algorithms are divided into those based on generative models and those based on discriminative models, and the representation learning process of deep models in clustering tasks is analyzed. Secondly, multiple types of algorithms are compared through experiments, and their advantages and disadvantages are summarized to guide model selection for specific tasks. Finally, application scenarios are described and future development trends of deep clustering are discussed.

Key words: Neural Network, Representation Learning, Deep Clustering, Unsupervised Learning, Loss Function

Received: 2022-08-30

Chinese Library Classification (ZTFLH): TP 311

Funding: Supported by the National Natural Science Foundation of China (No. 61976216, 62276265, 61672522)

Corresponding author: DING Shifei, Ph.D., professor. Research interests: pattern recognition, machine learning, data mining. E-mail: dingsf@cumt.edu.cn.

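About the authors: HOU Haiwei, Ph.D. candidate. Research interests: machine learning, deep learning, deep clustering. E-mail: hou_haiwei@cumt.edu.cn. XU Xiao, Ph.D., lecturer. Research interests: machine learning, cluster analysis. E-mail: xu_xiao@cumt.edu.cn.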

Cite this article:

HOU Haiwei, DING Shifei, XU Xiao. Research Progress of Deep Clustering Based on Unsupervised Representation Learning[J]. Pattern Recognition and Artificial Intelligence, 2022, 35(11): 999-1014.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202211005 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2022/V35/I11/999