Abstract: The feature pyramids used in existing target detectors do not fully exploit the information carried by feature maps at different scales, which makes these detectors poorly suited to low-resolution images and small targets. To address this problem, a target detector that combines a channel attention mechanism with residual learning blocks is proposed. First, a global channel attention mechanism is introduced so that the network learns a weight for each channel of the feature map, thereby enhancing global feature information. Then, lightweight residual blocks are used to highlight small changes in features and improve the detection of small targets in low-resolution images. In addition, deep features are merged into the shallow feature maps used for prediction, further improving detection accuracy for small targets. Experimental results on standard test datasets show that the proposed detector is suitable for low-resolution images and achieves better detection results for small targets.
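The channel attention step can be illustrated with a minimal sketch. It assumes a squeeze-and-excitation-style gate (global average pooling, two small fully connected layers, a sigmoid), which the abstract does not spell out. The function names, the toy feature map, and the hand-picked weights are all hypothetical; in a real detector the weights would be learned.

```python
import math

def global_avg_pool(fmap):
    """Squeeze: reduce each channel's H x W grid to a single average value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def dense(vec, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def relu(vec):
    return [max(0.0, v) for v in vec]

def sigmoid(vec):
    return [1.0 / (1.0 + math.exp(-v)) for v in vec]

def channel_attention(fmap, w1, b1, w2, b2):
    """Reweight the channels of a C x H x W feature map (nested lists).

    Assumed form: pool -> dense -> ReLU -> dense -> sigmoid -> scale.
    Returns the rescaled feature map and the per-channel gates.
    """
    squeezed = global_avg_pool(fmap)            # one scalar per channel
    hidden = relu(dense(squeezed, w1, b1))      # bottleneck layer
    gates = sigmoid(dense(hidden, w2, b2))      # per-channel weight in (0, 1)
    scaled = [[[g * x for x in row] for row in ch]
              for g, ch in zip(gates, fmap)]
    return scaled, gates

# Toy input: 2 channels of size 2 x 2 (illustrative values only).
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],   # an "active" channel
    [[0.0, 0.0], [0.0, 0.0]],   # an empty channel
]
# Hypothetical weights: reduce 2 channels to 1 hidden unit, then expand to 2.
w1, b1 = [[1.0, 1.0]], [0.0]
w2, b2 = [[1.0], [-1.0]], [0.0, 0.0]

scaled, gates = channel_attention(fmap, w1, b1, w2, b2)
print("gates:", gates)
```

With these weights the first channel receives the larger gate, so its responses are preserved while the second channel is suppressed. In the proposed detector the same kind of gating would be applied to convolutional feature maps before prediction.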