Low-Bit Quantization of Neural Networks Based on Exponential Moving Average Knowledge Distillation
LÜ Junhuan1,2, XU Ke1,2, WANG Dong1,2
1. Institute of Information Science, Beijing Jiaotong University, Beijing 100044; 2. Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing Jiaotong University, Beijing 100044
Abstract: The memory footprint and computational cost of deep neural networks currently restrict their widespread deployment, and network quantization is an effective compression method. In low-bit quantization, however, classification accuracy degrades as the number of quantization bits decreases. To address this problem, a low-bit quantization method for neural networks based on knowledge distillation is proposed. First, a small number of images are used for adaptive initialization to train the quantization step sizes of activations and weights, which speeds up the convergence of the quantized network. Then, exponential moving average knowledge distillation is introduced to normalize the distillation loss and the task loss and to guide the training of the quantized network. Experiments on the ImageNet and CIFAR-10 datasets show that the performance of the proposed method is close to or better than that of the full-precision network.
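To make the second step concrete, the following is a minimal PyTorch-style sketch of how the exponential moving average normalization of the distillation loss and the task loss described in the abstract might be implemented. The class and function names, the EMA decay value, the temperature, and the use of a KL-divergence distillation term are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F


class EMALossNormalizer:
    """Keeps exponential moving averages of the task loss and the distillation
    loss and uses them to put both terms on a comparable scale before summing
    (the decay value 0.99 is an assumption)."""

    def __init__(self, decay: float = 0.99):
        self.decay = decay
        self.ema_task = None  # running average of the task (cross-entropy) loss
        self.ema_kd = None    # running average of the distillation loss

    def _update(self, ema, value):
        v = value.detach()
        return v if ema is None else self.decay * ema + (1 - self.decay) * v

    def combine(self, task_loss, kd_loss):
        self.ema_task = self._update(self.ema_task, task_loss)
        self.ema_kd = self._update(self.ema_kd, kd_loss)
        # Normalize each loss by its running magnitude so that neither term
        # dominates the gradient, then sum them for back-propagation.
        return task_loss / (self.ema_task + 1e-8) + kd_loss / (self.ema_kd + 1e-8)


def distillation_step(student, teacher, images, labels, normalizer, temperature=4.0):
    """One training step of the quantized (student) network guided by the
    full-precision (teacher) network."""
    with torch.no_grad():
        t_logits = teacher(images)            # teacher is frozen
    s_logits = student(images)
    task_loss = F.cross_entropy(s_logits, labels)
    kd_loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=1),
        F.softmax(t_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature * temperature
    return normalizer.combine(task_loss, kd_loss)
```

In this sketch the normalized sum replaces a fixed weighting between the two losses, so the balance between task supervision and teacher guidance adapts automatically as training progresses.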