Deep Contrastive Multi-view Clustering with Transformer Fusion
LI Shunyong (1, 2), YUAN Zhiying (1), ZHAO Xingwang (3, 4)
1. School of Mathematics and Statistics, Shanxi University, Taiyuan 030006
2. Key Laboratory of Complex Systems and Data Science of Ministry of Education, Shanxi University, Taiyuan 030006
3. School of Computer and Information Technology, Shanxi University, Taiyuan 030006
4. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006
Abstract: As an important task in unsupervised learning, multi-view clustering aims to fuse information from heterogeneous views to uncover a consistent clustering structure. Existing methods have three limitations: the low-level features extracted by autoencoders lack cross-view semantic consistency; simple fusion strategies do not dynamically assess view quality; and multi-level contrastive constraints and local-global label alignment mechanisms are absent. To address these issues, a deep contrastive multi-view clustering algorithm with Transformer fusion (DCMCTF) is proposed. First, low-level feature distributions are aligned across views through an alternating adversarial learning mechanism, and instance-level and cluster-level contrastive learning are then introduced to enhance cross-view consistency and feature discriminability. Second, a Transformer-based adaptive fusion module dynamically learns the relationships among views; combined with quality-aware scoring, it generates robust consensus representations, and the global labels obtained from these representations are aligned with the view-specific local labels. Experiments on 9 datasets demonstrate that DCMCTF achieves excellent clustering performance.
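To make the idea of instance-level contrastive learning across views concrete, here is a minimal sketch. The abstract does not state the loss formulation, so this is a generic InfoNCE-style loss rather than the authors' objective; the function names, the temperature value, and the one-directional (view A to view B) form are all illustrative assumptions.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def instance_contrastive_loss(view_a, view_b, temperature=0.5):
    """Generic InfoNCE-style loss (an assumption, not the paper's exact loss).

    view_a, view_b: lists of equal-length feature vectors, where row i of
    both views describes the same underlying sample. Row i of view_a is
    pulled toward row i of view_b and pushed away from all other rows.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    n = len(a)
    total = 0.0
    for i in range(n):
        # Temperature-scaled cosine similarity of sample i to every row of view B.
        logits = [sum(x * y for x, y in zip(a[i], b[j])) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp; the positive pair sits at index i.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / n


# Two toy samples seen from two views; the second view differs only in scale.
aligned = instance_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                    [[2.0, 0.0], [0.0, 3.0]])
# Same features, but the rows of the second view are swapped (misaligned).
misaligned = instance_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                       [[0.0, 3.0], [2.0, 0.0]])
print(aligned < misaligned)
```

The loss is lower when corresponding rows agree across views, which is the consistency property that instance-level contrast is meant to encourage. A symmetric variant would average this loss with the view B to view A direction, and a cluster-level variant could apply the same computation to the columns of soft cluster-assignment matrices.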
Received: 17 November 2025
Fund: National Natural Science Foundation of China (No. 82274360), Program for the Scientific Activities of Selected Returned Overseas Professionals in Shanxi Province (No. 20250001), Fundamental Research Program of Shanxi Province (No. 202303021221054, 202403021211086), Research Project of Shanxi Scholarship Council of China (No. 2024-002), Graduate Education Innovation Program of Shanxi Province (No. 2025JG0006, 2025SJ032)
Corresponding author: ZHAO Xingwang, Ph.D., professor. His research interests include data mining and machine learning.
About the authors: LI Shunyong, Ph.D., professor. His research interests include statistical machine learning and big data analytics techniques. YUAN Zhiying, master's student. Her research interests include statistical machine learning.