Contrastive Learning Based on Bilevel Optimization of Pseudo Siamese Networks
CHEN Qingyu1,2,3, JI Fanfan2,3, YUAN Xiaotong2,3,4
1. School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044; 2. Engineering Research Center of Digital Forensics Ministry of Education, Nanjing University of Information Science and Technology, Nanjing 210044; 3. Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing University of Information Science and Technology, Nanjing 210044; 4. School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044
Abstract: Existing contrastive learning algorithms based on pseudo siamese networks adopt various designs to obtain the best student network, while the performance of the teacher network in downstream tasks is ignored. Therefore, a contrastive learning algorithm based on bilevel optimization of pseudo siamese networks (CLBO) is proposed to obtain the best teacher network by promoting mutual learning between the student and teacher networks. The bilevel optimization strategy consists of a student network optimization strategy based on nearest neighbor optimization and a teacher network optimization strategy based on stochastic gradient descent. In the student network optimization strategy, the teacher network is treated as a constraint term to help the student network learn better from the teacher network. In the teacher network optimization strategy, the parameters computed by stochastic gradient descent are used to update the teacher network. Experiments on five datasets show that CLBO outperforms other algorithms on k-NN classification and linear classification tasks, and its advantage is particularly obvious with smaller batch sizes.
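The following is a minimal sketch of how the alternating bilevel update described above might be organized in a PyTorch-style training loop. It is an illustrative assumption rather than the authors' released implementation: the small MLP encoders, the negative-cosine similarity loss, the proximal term standing in for the nearest-neighbor-based constraint, and the prox_weight hyper-parameter are all placeholders.

```python
# Hypothetical sketch of CLBO's bilevel update (not the authors' code).
# Assumptions: toy MLP encoders, negative-cosine loss, illustrative prox_weight.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_encoder(dim_in=32, dim_out=16):
    return nn.Sequential(nn.Linear(dim_in, 64), nn.ReLU(), nn.Linear(64, dim_out))

student, teacher = make_encoder(), make_encoder()
student_opt = torch.optim.SGD(student.parameters(), lr=0.05)
teacher_opt = torch.optim.SGD(teacher.parameters(), lr=0.05)

def similarity_loss(p, z):
    # Negative cosine similarity between two views; the target branch is
    # detached so gradients only flow through the branch being optimized.
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()

for step in range(100):
    x = torch.randn(8, 32)  # a mini-batch of inputs
    v1 = x + 0.1 * torch.randn_like(x)  # two augmented views
    v2 = x + 0.1 * torch.randn_like(x)

    # Inner level: optimize the student. The teacher acts as a constraint
    # term via a proximal penalty pulling the student parameters toward it.
    prox_weight = 0.1
    student_opt.zero_grad()
    loss_s = similarity_loss(student(v1), teacher(v2))
    prox = sum(((p - q.detach()) ** 2).sum()
               for p, q in zip(student.parameters(), teacher.parameters()))
    (loss_s + prox_weight * prox).backward()
    student_opt.step()

    # Outer level: update the teacher by stochastic gradient descent on the
    # loss computed against the freshly updated student.
    teacher_opt.zero_grad()
    loss_t = similarity_loss(teacher(v1), student(v2))
    loss_t.backward()
    teacher_opt.step()
```

In this sketch the teacher is trained by its own gradient step rather than by the momentum (exponential moving average) update used in many pseudo siamese methods, reflecting the abstract's description of a stochastic-gradient-descent-based teacher optimization strategy.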