Abstract: In micro-operating systems, traditional object detection methods cannot detect objects that are partially occluded or appear in multiple poses. An improved faster region-based convolutional neural network (Faster RCNN) is adopted to solve this problem. Building on the original Faster RCNN, a deep residual network, which performs excellently in image classification, is introduced as the framework of the algorithm, and an online hard example mining strategy is employed to improve performance by alleviating the imbalance between positive and negative examples. Experimental results show that the proposed method effectively detects objects with partial occlusion and multiple poses. Compared with traditional methods, it adapts well to the environment and responds quickly, which verifies its practicality.
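The selection step at the core of online hard example mining can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `select_hard_examples`, the loss values, and the batch size are hypothetical, and further steps of the full strategy (such as suppressing heavily overlapping regions before selection) are omitted.

```python
def select_hard_examples(losses, batch_size):
    """Return indices of the `batch_size` candidate regions with the highest loss.

    `losses` is a hypothetical list of per-region training losses computed in a
    forward pass; only the selected regions would be used for the backward pass.
    """
    if batch_size <= 0:
        return []
    # Rank region indices by loss, highest first. Python's sort is stable,
    # so regions with equal loss keep their original order.
    ranked = sorted(range(len(losses)), key=lambda i: -losses[i])
    return ranked[:batch_size]


# Hypothetical losses for six candidate regions: most are easy (low loss),
# a few are hard (high loss).
example_losses = [0.05, 2.30, 0.10, 1.20, 0.02, 0.90]
hard_indices = select_hard_examples(example_losses, 3)
print(hard_indices)  # [1, 3, 5]
```

Because training then concentrates on the regions the current model handles worst, the many easy examples no longer dominate each mini-batch.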
PENG Gang, YANG Shiqi, HUANG Xinhan, SU Hao. Improved Object Detection Method of Micro-operating System Based on Region Convolutional Neural Network[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2018, 31(2): 142-149.