Abstract: Traditional deep neural networks cannot be deployed on edge devices with limited computing capacity because of their large number of parameters and high computational cost. In this paper, a lightweight network based on neural architecture search is designed to address this problem. Convolution units with different numbers of groups are taken as the search space, and neural architecture search is utilized to obtain both the group structure and the overall architecture of the network. Meanwhile, a cycle annealing search strategy is proposed to solve the multi-objective optimization problem of neural architecture search, taking both the accuracy and the computational cost of the model into account. Experiments on benchmark datasets show that the proposed network achieves better performance than state-of-the-art methods.
YAO Xiao, SHI Yewei, HUO Guanying, XU Ning. Lightweight Model Construction Based on Neural Architecture Search. Pattern Recognition and Artificial Intelligence, 2021, 34(11): 1038-1048.
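The grouped-convolution search space and the accuracy-versus-computation trade-off described in the abstract can be illustrated with a minimal sketch in PyTorch. The candidate group counts, the FLOPs estimate, and the budget-penalized reward below are assumptions for illustration only, not the authors' exact formulation.

import torch
import torch.nn as nn

# Assumed candidate group counts forming the search space (illustrative).
CANDIDATE_GROUPS = [1, 2, 4, 8]

def make_group_conv(in_ch: int, out_ch: int, groups: int) -> nn.Module:
    """One candidate unit: 3x3 grouped convolution + BN + ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1,
                  groups=groups, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

def conv_flops(in_ch: int, out_ch: int, groups: int,
               k: int = 3, h: int = 32, w: int = 32) -> int:
    """Multiply-accumulate count of a grouped k x k conv on an h x w map."""
    return (in_ch // groups) * out_ch * k * k * h * w

def reward(accuracy: float, flops: float, flops_budget: float,
           alpha: float = 0.5) -> float:
    """Illustrative multi-objective reward: accuracy penalized when the
    computation cost exceeds a given budget (assumed form)."""
    return accuracy - alpha * max(0.0, flops / flops_budget - 1.0)

if __name__ == "__main__":
    x = torch.randn(1, 32, 32, 32)
    for g in CANDIDATE_GROUPS:
        unit = make_group_conv(32, 64, g)
        y = unit(x)
        print(g, tuple(y.shape), conv_flops(32, 64, g))

Larger group counts reduce the parameter and FLOPs cost of each unit at a possible loss of accuracy, which is the trade-off the multi-objective search strategy is designed to balance.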