Fine-Grained Visual Classification Based on Sparse Bilinear Convolutional Neural Network

doi:10.16451/j.cnki.issn1003-6059.201904006


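Abstract: To address the overfitting that bilinear convolutional neural networks (B-CNN) suffer in fine-grained image classification because of their large parameter count and high complexity, a sparse B-CNN is proposed. First, a scaling factor is introduced for each feature channel of the B-CNN, and the scaling factors are driven toward sparsity by regularization during training. Next, the magnitude of each scaling factor is used to judge the importance of its feature channel. Finally, unimportant feature channels are pruned at a set ratio, which counters overfitting and increases the salience of key features. Sparse B-CNN is weakly supervised and can be trained end to end. Experiments on three fine-grained image datasets, FGVC-aircraft, Stanford dogs, and Stanford cars, show that sparse B-CNN is more accurate than B-CNN and is better than or close to other common fine-grained image classification algorithms.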
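The three steps in the abstract (per-channel scaling factors, sparsity regularization, ratio-based pruning) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the regularizer, so an L1 penalty is assumed here, and the names `l1_penalty`, `channels_to_keep`, `lam`, and `prune_ratio` are hypothetical.

```python
def l1_penalty(gammas, lam):
    """Sparsity term added to the training loss: lam * sum(|gamma|).

    Assumption: an L1 penalty is used; the abstract only says
    "regularization" without naming the norm.
    """
    return lam * sum(abs(g) for g in gammas)


def channels_to_keep(gammas, prune_ratio):
    """Return the sorted indices of channels that survive pruning.

    Channels are ranked by |gamma|; the fraction `prune_ratio` with the
    smallest magnitudes is treated as unimportant and dropped.
    """
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    n_prune = int(len(gammas) * prune_ratio)
    # Indices ordered from least to most important channel.
    order = sorted(range(len(gammas)), key=lambda i: abs(gammas[i]))
    return sorted(order[n_prune:])


# Toy scaling factors after sparse training (made-up values).
gammas = [0.9, 0.01, 0.5, 0.002, 0.3, 0.0]
print(channels_to_keep(gammas, 0.5))  # [0, 2, 4]
```

In a real network the surviving indices would be used to slice the corresponding convolution filters, after which the pruned model is typically fine-tuned.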


Keywords: fine-grained image classification, bilinear convolutional neural network (B-CNN), overfitting, network sparsification, network pruning

Abstract: Bilinear convolutional neural networks (B-CNN) overfit in fine-grained visual recognition because of their large number of parameters and complex structure. In this paper, a sparse B-CNN is proposed to handle the problem. Firstly, a scaling factor is introduced into each feature channel of B-CNN, and sparsity regularization is applied to the scaling factors during training. Then, the feature channels that contribute little to the final classification are identified by their small scaling factors. Finally, these channels are pruned in a certain proportion to prevent overfitting and increase the significance of key features. The learning of sparse B-CNN is weakly supervised and end-to-end. Verification experiments on the FGVC-aircraft, Stanford dogs and Stanford cars fine-grained image datasets show that the accuracy of sparse B-CNN is higher than that of the original B-CNN. Moreover, compared with other advanced algorithms for fine-grained visual recognition, the performance of sparse B-CNN is comparable or better.
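To make the link between channel count and model size concrete, the sketch below shows why pruning channels shrinks a bilinear descriptor quadratically. It assumes the standard symmetric bilinear pooling of the original B-CNN (sum of outer products over spatial locations, then signed square root and L2 normalization), with both streams sharing the same features. The function names `bilinear_pool` and `select_channels` and the toy values are hypothetical, and whether the authors prune one or both streams is not stated in the abstract.

```python
import math


def bilinear_pool(features):
    """Symmetric bilinear pooling of a feature map.

    `features` is a list of spatial locations, each a list of C channel
    responses. Returns the flattened C*C descriptor: the sum of outer
    products over locations, followed by signed square root and L2
    normalization.
    """
    c = len(features[0])
    pooled = [0.0] * (c * c)
    for x in features:
        for i in range(c):
            for j in range(c):
                pooled[i * c + j] += x[i] * x[j]
    rooted = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in rooted)) or 1.0
    return [v / norm for v in rooted]


def select_channels(features, keep):
    """Keep only the channels listed in `keep` at every location."""
    return [[x[i] for i in keep] for x in features]


# Toy feature map: 2 spatial locations, 3 channels (made-up values).
features = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
full = bilinear_pool(features)
pruned = bilinear_pool(select_channels(features, [0, 2]))
print(len(full), len(pruned))  # 9 4
```

Because the descriptor has C × C entries, removing a fraction of the channels reduces the input to the classifier, and hence its parameter count, by roughly the square of the remaining fraction.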

Key words: Fine-Grained Visual Recognition, Bilinear Convolutional Neural Network (B-CNN), Overfitting, Network Sparsity, Network Pruning

Received: 2019-01-03

Chinese Library Classification (CLC) number: TP 391

Funding: Supported by the National Natural Science Foundation of China (No. 61673276, 61703277)

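About the authors: MA Li, master's student. Research interests: computer vision and image processing. E-mail: mali1906@163.com. WANG Yongxiong (corresponding author), Ph.D., professor. Research interests: intelligent robots and vision. E-mail: wyxiong@usst.edu.cn.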

Cite this article:

MA Li, WANG Yongxiong. Fine-Grained Visual Classification Based on Sparse Bilinear Convolutional Neural Network[J]. Pattern Recognition and Artificial Intelligence, 2019, 32(4): 336-344.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.201904006 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2019/V32/I4/336
