Occluded Pedestrian Detection Algorithm Based on Improved Network Structure of YOLOv3

doi:10.16451/j.cnki.issn1003-6059.202006010


Abstract: To address the high missed-detection rate of YOLOv3 for occluded targets in surveillance-video pedestrian detection, an occluded pedestrian detection algorithm based on an improved YOLOv3 network structure is proposed. First, a spatial pyramid pooling network is introduced into the fully connected layer of the network to strengthen its multi-scale feature fusion capability. Then, network pruning is applied to remove redundant structure, which avoids the degradation and overfitting caused by deepening the network and also reduces the number of parameters. Multi-scale training is performed on a corridor pedestrian dataset to obtain the optimal weight model. Experiments show that the proposed method improves both average accuracy and detection speed.
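
The abstract does not specify how the spatial pyramid pooling module is configured. As a rough illustration only, the sketch below follows a variant often paired with YOLOv3, where the same feature map is max-pooled at stride 1 with several kernel sizes and the results are concatenated along the channel axis. The kernel sizes (5, 9, 13), the border handling, and the names `max_pool_same` and `spp_block` are assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence

FeatureMap = List[List[float]]  # a single channel of shape H x W


def max_pool_same(fm: FeatureMap, k: int) -> FeatureMap:
    """Stride-1 max pooling with an odd kernel size k.

    Windows are clipped at the borders, which is equivalent to padding
    with negative infinity, so the output keeps the input's H x W shape.
    """
    h, w = len(fm), len(fm[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                fm[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(channels: List[FeatureMap],
              kernels: Sequence[int] = (5, 9, 13)) -> List[FeatureMap]:
    """Concatenate the input channels with their pooled copies.

    The kernel sizes are an assumed configuration; each one contributes
    a copy of every channel pooled over a different receptive field.
    """
    out = list(channels)
    for k in kernels:
        out.extend(max_pool_same(c, k) for c in channels)
    return out


if __name__ == "__main__":
    fm = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]
    fused = spp_block([fm])
    print(len(fused), len(fused[0]), len(fused[0][0]))
```

With one input channel and three kernel sizes, the block returns four channels of the original spatial size, so features pooled over small and large neighborhoods can be fused by the layers that follow.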

Keywords: Pedestrian Detection, Deep Learning, YOLOv3, Spatial Pyramid Pooling Network, Network Pruning

Received: 2020-03-05

Chinese Library Classification (CLC) number: TP 391.4

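Funding: Supported by the Fundamental Research Funds for the Central Universities, University of Science and Technology Beijing (No. FRF-BD-19-002A)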

Corresponding author: LIU Li, Ph.D., professor. Her research interests include computer networks and pattern recognition. E-mail: liuli@ustb.edu.cn.

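About the authors: ZHENG Yang, master's student. His research interests include deep learning and object detection and tracking. E-mail: 1184919257@qq.com. FU Dongmei, Ph.D., professor. Her research interests include deep learning and pattern recognition. E-mail: fdm2003@163.com.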

Cite this article:

LIU Li, ZHENG Yang, FU Dongmei. Occluded Pedestrian Detection Algorithm Based on Improved Network Structure of YOLOv3[J]. Pattern Recognition and Artificial Intelligence, 2020, 33(6): 568-574.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.202006010 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2020/V33/I6/568
