1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116
2. Engineering Research Center of Mine Digitization of Ministry of Education, China University of Mining and Technology, Xuzhou 221116
3. Innovation Research Center of Disaster Intelligent Prevention and Emergency Rescue, China University of Mining and Technology, Xuzhou 221116
Abstract: With the rapid development of remote sensing technology, object detection in remote sensing images is widely applied in fields such as resource exploration, urban planning, and natural disaster assessment. To address the complex backgrounds and small target scales typical of remote sensing images, an interpretable object detection method for remote sensing images based on deep reinforcement learning is proposed. First, deep reinforcement learning is applied to the region proposal network of the faster region-based convolutional neural network (Faster R-CNN), and the excitation function is modified to improve detection accuracy on remote sensing images. Second, the original backbone network, which has a large number of parameters, is made lightweight to improve the detection speed and portability of the model. Finally, the interpretability of the hidden-layer representations in the model is quantified with the network anatomy method, which associates these representations with concepts that humans can understand. Experiments on three public remote sensing datasets show that the proposed method improves detection performance, and the improved network anatomy method verifies its effectiveness.
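The abstract does not spell out how the interpretability of a hidden-layer unit is quantified. As an illustration only, network-dissection-style analyses commonly score a unit by the intersection over union (IoU) between its thresholded activation map and a human-labelled concept mask. The sketch below assumes that formulation; the function names, the activation threshold, and the `min_iou` cutoff of 0.04 are hypothetical choices for this example and are not taken from the paper, whose improved method may differ.

```python
from typing import List

Mask = List[List[bool]]


def binarize(activation: List[List[float]], threshold: float) -> Mask:
    """Binary mask of a unit's activation map: True where activation > threshold."""
    return [[value > threshold for value in row] for row in activation]


def iou(unit_mask: Mask, concept_mask: Mask) -> float:
    """Intersection over union of two equally sized binary masks."""
    intersection = 0
    union = 0
    for unit_row, concept_row in zip(unit_mask, concept_mask):
        for u, c in zip(unit_row, concept_row):
            intersection += int(u and c)
            union += int(u or c)
    return intersection / union if union else 0.0


def is_concept_detector(score: float, min_iou: float = 0.04) -> bool:
    """Assumed decision rule: a unit 'detects' a concept if its IoU exceeds min_iou."""
    return score > min_iou


# Toy 3x3 activation map of one hidden unit (hypothetical values).
activation = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.0],
    [0.1, 0.0, 0.0],
]
# Toy pixel-level annotation of one concept (hypothetical values).
concept = [
    [True, True, False],
    [False, True, False],
    [False, False, False],
]

score = iou(binarize(activation, threshold=0.5), concept)
print(score, is_concept_detector(score))
```

In this toy case the unit mask and the concept mask share two pixels and cover four pixels in total, so the score is 2/4. In practice the threshold would be chosen per unit from the activation distribution over a dataset rather than fixed by hand.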