Label-Free Network Pruning via Reinforcement Learning
LIU Huidong1, DU Fang1,2, YU Zhenhua1,2, SONG Lijuan1,2
1. School of Information Engineering, Ningxia University, Yinchuan 750021
2. Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Ningxia University, Yinchuan 750021
Abstract: To remove redundant structures from deep neural networks and find a network structure that balances capability and complexity, a label-free global compression learning method (LFGCL) is proposed. A global pruning strategy is learned from a representation of the network architecture, which avoids the suboptimal compression rates that arise when a network is pruned layer by layer. LFGCL does not depend on data labels during pruning; the network architecture is optimized so that its output features remain similar to those of the baseline network. The deep deterministic policy gradient algorithm is applied to explore the optimal network structure by inferring the compression ratios of all layers through reinforcement learning. Experiments on multiple datasets show that LFGCL achieves better performance.
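The label-free objective described in the abstract can be illustrated with a minimal sketch. It assumes a small ReLU network stored as NumPy weight matrices, an L1-norm criterion for choosing which units to remove, and a negative mean squared error between baseline and pruned output features as the reward; none of these choices is confirmed by the text. The names `prune_mask`, `forward`, and `label_free_reward` are hypothetical. In the method itself the per-layer ratios would be produced by a deep deterministic policy gradient agent, which is not implemented here.

```python
import numpy as np

def prune_mask(weight, ratio):
    """Boolean mask keeping the (1 - ratio) fraction of output units
    with the largest L1 norm (assumed pruning criterion)."""
    n_out = weight.shape[0]
    n_keep = max(1, int(round(n_out * (1.0 - ratio))))
    scores = np.abs(weight).sum(axis=1)
    keep = np.argsort(scores)[-n_keep:]
    mask = np.zeros(n_out, dtype=bool)
    mask[keep] = True
    return mask

def forward(x, weights, masks=None):
    """ReLU network; zeroing a unit's activation emulates removing it."""
    h = x
    for i, w in enumerate(weights):
        h = np.maximum(h @ w.T, 0.0)
        if masks is not None:
            h = h * masks[i]
    return h

def label_free_reward(x, weights, ratios):
    """Reward from unlabeled inputs only: negative MSE between the output
    features of the baseline network and those of the pruned network.
    `ratios` holds one compression ratio per hidden layer; the final
    layer is left unpruned so that the feature dimensions match."""
    masks = [prune_mask(w, r) for w, r in zip(weights[:-1], ratios)]
    masks.append(np.ones(weights[-1].shape[0], dtype=bool))
    baseline = forward(x, weights)
    pruned = forward(x, weights, masks)
    return -float(np.mean((baseline - pruned) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = [rng.normal(size=(16, 8)),
               rng.normal(size=(16, 16)),
               rng.normal(size=(4, 16))]
    x = rng.normal(size=(32, 8))  # unlabeled inputs
    for ratios in ([0.0, 0.0], [0.25, 0.25], [0.75, 0.75]):
        print(ratios, label_free_reward(x, weights, ratios))
```

Because the reward depends only on how closely the pruned features track the baseline features, an agent could in principle be trained to propose the ratios for all layers at once without access to labels.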