Deep Networks Detection Algorithm Fusing Multiple Dilated Convolution Operator and Multi-level Characteristics

doi:10.16451/j.cnki.issn1003-6059.202010004



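Abstract: In object detection models based on deep networks, relying only on serial convolution operations leaves the model unable to describe the detailed information at different network levels and the global information of the feature maps, which weakens small-object detection and lowers detection accuracy. Building on the residual network structure, this paper proposes a deep network detection algorithm that fuses a multiple dilated convolution (MDC) operator with multi-level features. First, the MDC operator is designed: its convolution kernels have 5 different receptive fields and can produce 8 feature maps with different semantics. The operator is introduced into the feature extraction stage of the serial network to construct the feature layers. Next, transposed convolution is used to increase the dimension of the detection layer so that feature layers from different levels can be concatenated, yielding a detection layer that retains the original features of the targets to the greatest extent possible. Finally, non-maximum suppression completes the detection algorithm. Experiments show that the proposed algorithm effectively improves the mean detection precision and the ability to detect small targets.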
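As a rough illustration of why dilated kernels give different receptive fields, the sketch below computes the span covered by one dilated kernel along one axis, using the standard relation k_eff = k + (k - 1)(d - 1). The 3x3 kernel size and the dilation rates 1 to 5 are assumptions chosen only to produce five distinct spans; the abstract does not state the rates used by the MDC operator.

```python
from typing import List, Sequence

# Hypothetical dilation rates; the abstract does not give the actual MDC values.
ASSUMED_DILATIONS = (1, 2, 3, 4, 5)


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels along one axis, covered by a dilated kernel.

    A dilation of d inserts d - 1 gaps between adjacent kernel taps,
    so a k-tap kernel spans k + (k - 1) * (d - 1) pixels.
    """
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_fields(kernel_size: int = 3,
                     dilations: Sequence[int] = ASSUMED_DILATIONS) -> List[int]:
    """Effective span of the same kernel under each dilation rate."""
    return [effective_kernel_size(kernel_size, d) for d in dilations]


if __name__ == "__main__":
    for d, span in zip(ASSUMED_DILATIONS, receptive_fields()):
        print(f"dilation={d}: 3x3 kernel spans {span}x{span} input pixels")
```

The parameter count stays at 3x3 per kernel for every rate, which is the usual motivation for dilation: a wider view of the feature map without extra weights.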


Keywords: Multiple Dilated Convolution (MDC) Operator, Object Detection, Transposed Convolution, Detailed Information, Global Information

Abstract: Using only sequential convolution operations in deep networks leaves the feature layers without detailed target information and global characteristics, which reduces both the detection performance for small objects and the overall detection accuracy. In this paper, a deep network detection algorithm fusing a multiple dilated convolution (MDC) operator and multi-level characteristics is proposed, based on the residual network structure. The MDC convolution kernels cover 5 different receptive fields and can generate 8 feature maps with different semantics. The MDC operator is introduced into the feature extraction block to build a new feature layer. Transposed convolution is employed to increase the dimension of the detection layer and to concatenate multi-level feature layers, so that the original features of the targets are retained in the newly generated detection layer to the greatest extent. Finally, the detection model is completed with non-maximum suppression. The experimental results show that the proposed model, with multi-level features and the MDC operator, effectively improves the mean average precision and the detection performance for small targets.
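The final non-maximum suppression step can be sketched as follows. This is a minimal, generic greedy NMS in plain Python, not the authors' implementation; the `(x1, y1, x2, y2)` box format, the function names, and the 0.5 IoU threshold are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    A box is kept only if its IoU with every already-kept box
    does not exceed iou_threshold (an assumed default of 0.5).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    candidate_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    candidate_scores = [0.9, 0.8, 0.7]
    print(nms(candidate_boxes, candidate_scores))
```

In the example, the second box overlaps the first with an IoU of 81/119 and is suppressed, while the third box does not overlap either and is kept.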

Key words: Multiple Dilated Convolution (MDC) Operator, Target Detection, Transposed Convolution, Detailed Information, Global Information

Received: 2020-07-13

CLC number: TP391

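Funding: Supported by the Key Scientific Research Project of Higher Education Institutions of Henan Province (No.21A120004), the Innovative Science and Technology Talent Team Construction Project of Henan Province (No.CXTD2016054), the Zhongyuan High-Level Talents Special Support Program (No.ZYQR201912031), and the Fundamental Research Funds of Henan Polytechnic University (No.NSFRF170501)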

Corresponding author: ZHAO Yunji, Ph.D., lecturer. Main research interests include pattern recognition and intelligent control. E-mail: auyjz@hpu.edu.cn.

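About the authors: ZHANG Xinliang, Ph.D., associate professor. Main research interests include intelligent control, detection technology, and automation devices. E-mail: zxldq@hpu.edu.cn. XIE Heng, master's student. Main research interests include pattern recognition and digital image processing. E-mail: 708998966@qq.com. WANG Wanru, master's student. Main research interests include pattern recognition and digital image processing. E-mail: 870925329@qq.com. WEI Shengqiang, master's student. Main research interests include pattern recognition and digital image processing. E-mail: 963306062@qq.com.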

Cite this article:

ZHANG Xinliang, XIE Heng, ZHAO Yunji, WANG Wanru, WEI Shengqiang. Deep Networks Detection Algorithm Fusing Multiple Dilated Convolution Operator and Multi-level Characteristics[J]. Pattern Recognition and Artificial Intelligence, 2020, 33(10): 898-905.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202010004 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2020/V33/I10/898
