Abstract: Adaptive moment estimation algorithms, which combine momentum with adaptive step-size techniques, are widely applied in deep learning. However, existing algorithms fail to achieve optimal performance in both theory and practice. To address this problem, an AdaBelief-based heavy-ball momentum method, AdaBHB, is proposed. The AdaBelief technique of flexibly adjusting the step size is introduced to improve empirical performance, and the heavy-ball momentum method, with its step size adjusted by an exponential moving average strategy, is employed to accelerate convergence. Following the convergence analysis techniques of the AdaBelief and heavy-ball momentum methods, a time-varying step size and momentum coefficient are carefully selected, and a momentum term and an adaptive matrix are incorporated. It is proved that AdaBHB attains the optimal individual convergence rate for non-smooth general convex optimization problems. Finally, experiments on convex optimization problems and deep neural networks verify the theoretical analysis and show that AdaBHB achieves the theoretically optimal convergence rate while improving empirical performance.
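The abstract describes AdaBHB only at a high level: an AdaBelief-style adaptive step size combined with a Polyak heavy-ball momentum term under time-varying schedules. The following Python sketch illustrates how such a combination could look; it is not the authors' exact update rule, and the schedules alpha_t and beta_t, the function name adabhb_sketch, and the use of the raw gradient in the adaptive step are all assumptions made for illustration.

import numpy as np

def adabhb_sketch(grad_fn, x0, T=1000, alpha=0.1, beta=0.9, eps=1e-8):
    """Illustrative AdaBelief + heavy-ball sketch (not the paper's exact method).

    grad_fn(x) returns a (sub)gradient of the objective at x.
    The time-varying step size alpha_t and momentum coefficient beta_t
    below are assumed schedules; in the paper they are chosen so that the
    individual iterate attains the optimal rate for non-smooth convex problems.
    """
    x = x0.copy()
    x_prev = x0.copy()        # needed for the heavy-ball term x_t - x_{t-1}
    m = np.zeros_like(x0)     # exponential moving average of gradients
    s = np.zeros_like(x0)     # AdaBelief-style EMA of (g_t - m_t)^2, the "belief"
    for t in range(1, T + 1):
        g = grad_fn(x)
        m = beta * m + (1 - beta) * g
        s = beta * s + (1 - beta) * (g - m) ** 2
        alpha_t = alpha / np.sqrt(t)     # assumed time-varying step size
        beta_t = t / (t + 2)             # assumed momentum-coefficient schedule
        # adaptive step scaled by the belief matrix, plus heavy-ball momentum
        x_new = x - alpha_t * g / (np.sqrt(s) + eps) + beta_t * (x - x_prev)
        x_prev, x = x, x_new
    return x

For instance, adabhb_sketch(lambda x: np.sign(x), np.array([5.0])) runs the sketch on the non-smooth convex objective f(x) = |x|, whose subgradient is sign(x).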