Multi-layers Context Convolutional Neural Network for Object Detection

doi:10.16451/j.cnki.issn1003-6059.202002003


Abstract: Insufficient feature information in object detection leads to low detection accuracy on small and occluded targets. To address this, a multi-layers context convolutional neural network (MLC-CNN) is proposed, which extracts contextual information from multiple layers and combines it with the local features of objects. MLC-CNN consists of a region proposal network (RPN) sub-network and a multi-layers context (MLC) sub-network. The RPN sub-network captures fixed-length feature vectors as object features, and the MLC sub-network obtains the corresponding contextual information from the different feature maps. The two kinds of information are then fused. In addition, hard example training is employed to address the problem of imbalanced data. Experiments on the PASCAL VOC2007 and PASCAL VOC2012 datasets indicate that the mean average precision (mAP) is improved.
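The fusion and hard example steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the fusion operator or the hard example rule, so the sketch assumes L2 normalization followed by concatenation for fusion, and top-k selection by loss for hard examples. The function names `l2_normalize`, `fuse_features` and `select_hard_examples` are hypothetical and introduced only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (an all-zero vector is returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_features(local_feat, context_feats):
    """Fuse a local object feature with context features from several layers.

    Assumption: fusion is concatenation after L2 normalization, so that
    features taken from different layers have comparable scale.
    """
    fused = l2_normalize(local_feat)
    for ctx in context_feats:
        fused.extend(l2_normalize(ctx))
    return fused


def select_hard_examples(losses, k):
    """Return the indices of the k highest-loss examples, in ascending order.

    Assumption: "hard" simply means "largest current loss".
    """
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(order[:k])


if __name__ == "__main__":
    # One fixed-length object feature and context vectors from two layers.
    fused = fuse_features([3.0, 4.0], [[0.0, 2.0], [1.0, 0.0]])
    print(len(fused), fused)
    # Keep only the two hardest of four candidate regions.
    print(select_hard_examples([0.1, 0.9, 0.3, 0.7], 2))
```

Normalizing before concatenation is one common way to keep a single layer from dominating the fused vector; other operators, such as element-wise summation or a learned projection, would fit the same description in the abstract.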

Key words: Object Detection; Region Proposal Network (RPN); Multi-layers Context Information (MLC); Feature Fusion

Received: 25 September 2019

CLC number: TP 391.4

Fund: Supported by the National Natural Science Foundation of China (No. 61872327), the Natural Science Foundation of Anhui Province (No. 1708085MF146), the Fundamental Research Funds for the Central Universities (No. ACAIM190102), and the Project of Innovation Team of Ministry of Education of China (No. IRT17R32)

Corresponding Author: FANG Baofu, Ph.D., associate professor. His research interests include machine learning, machine vision and multi-robot systems.

About the authors: WANG Hao, Ph.D., professor. His research interests include intelligent computing theory and software, artificial intelligence, machine vision and data mining. SHAN Wenjing, master's student. Her research interests include computer application technology, object detection and machine learning.


Cite this article:

WANG Hao, SHAN Wenjing, FANG Baofu. Multi-layers Context Convolutional Neural Network for Object Detection[J]. Pattern Recognition and Artificial Intelligence, 2020, 33(2): 113-120.

URL:

http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.202002003 OR http://manu46.magtech.com.cn/Jweb_prai/EN/Y2020/V33/I2/113
