Pattern Recognition and Artificial Intelligence
  2021, Vol. 34 Issue (12): 1143-1151    DOI: 10.16451/j.cnki.issn1003-6059.202112007
Researches and Applications
Low-Bit Quantization of Neural Network Based on Exponential Moving Average Knowledge Distillation
LÜ Junhuan1,2, XU Ke1,2, WANG Dong1,2
1. Institute of Information Science, Beijing Jiaotong University, Beijing 100044;
2. Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing Jiaotong University, Beijing 100044

Abstract  The memory footprint and computational cost of deep neural networks restrict their widespread deployment, and network quantization is an effective compression method. However, in low-bit quantization the classification accuracy of the network degrades as the number of quantization bits decreases. To address this problem, a low-bit quantization method for neural networks based on knowledge distillation is proposed. Firstly, a small number of images are used for adaptive initialization to train the quantization steps of activations and weights, speeding up the convergence of the quantization network. Then, exponential moving average knowledge distillation is introduced to normalize the distillation loss and the task loss and to guide the training of the quantization network. Experiments on the ImageNet and CIFAR-10 datasets show that the performance of the proposed method is close to or better than that of the full-precision network.
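The loss normalization described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch example, not the paper's implementation: the names EMALossBalancer and distillation_step are illustrative. It keeps exponential moving averages of the task loss and the distillation loss, normalizes each term by its running average, and sums the normalized terms to guide the training of a quantized student network with a full-precision teacher.

```python
# Hypothetical sketch of EMA-normalized knowledge distillation for a
# quantized student network, assuming PyTorch. Names are illustrative.
import torch
import torch.nn.functional as F

class EMALossBalancer:
    """Keeps exponential moving averages of the task loss and the
    distillation loss and normalizes each term by its running average."""
    def __init__(self, momentum=0.99):
        self.momentum = momentum
        self.ema_task = None
        self.ema_distill = None

    def _update(self, ema, value):
        value = value.detach()
        return value if ema is None else self.momentum * ema + (1 - self.momentum) * value

    def combine(self, task_loss, distill_loss):
        self.ema_task = self._update(self.ema_task, task_loss)
        self.ema_distill = self._update(self.ema_distill, distill_loss)
        # Dividing each loss by its running average keeps the two terms on a
        # comparable scale as their magnitudes drift during quantized training.
        return task_loss / (self.ema_task + 1e-8) + distill_loss / (self.ema_distill + 1e-8)


def distillation_step(student, teacher, images, labels, balancer, temperature=4.0):
    """One training step of the quantized student guided by the
    full-precision teacher (hypothetical training loop)."""
    student_logits = student(images)
    with torch.no_grad():
        teacher_logits = teacher(images)
    task_loss = F.cross_entropy(student_logits, labels)
    distill_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    return balancer.combine(task_loss, distill_loss)
```

The returned scalar would be backpropagated through the quantized student only; the teacher is kept frozen, and the balancer's statistics are updated once per step.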
Key words: Deep Learning, Network Quantization, Knowledge Distillation, Model Compression
Received: 20 April 2021     
ZTFLH: TP 391  
Fund:National Key Research and Development Program of China(No.2019YFB2204200), Fundamental Research Funds for the Central Universities(No.2020JBM020), Beijing Natural Science Foundation Program(No.4202063)
Corresponding Author: WANG Dong, Ph.D., professor. His research interests include image processing, model compression and chip design.
About authors: LÜ Junhuan, master student. Her research interests include neural network quantization and model compression.
XU Ke, Ph.D. candidate. His research interests include neural network pruning, quantization and model compression.
Cite this article:
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.202112007 OR http://manu46.magtech.com.cn/Jweb_prai/EN/Y2021/V34/I12/1143