Remote Sensing Image Recognition Algorithm Based on Pseudo Global Swin Transformer
WANG Keping1,2, ZUO Xinhao1,2, YANG Yi1,2, FEI Shumin1,3
1. School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo 454003; 2. Henan International Joint Laboratory of Direct Drive and Control of Intelligent Equipment, Henan Polytechnic University, Jiaozuo 454003; 3. School of Automation, Southeast University, Nanjing 210096
Abstract: In remote sensing image recognition, a key factor is determining, among multiple concurrent targets, the core target that matches human thinking habits. Allocating attention from a global perspective in accordance with human visual habits is one way to select core targets. This paper combines the Transformer approach to feature extraction with the Swin Transformer's reduction of computational complexity through image gridding, and proposes a remote sensing image recognition algorithm based on a pseudo global Swin Transformer. The pseudo global Swin Transformer module aggregates the local information of the rasterized remote sensing image into a single feature value, which replaces pixel-based global information so that global features are obtained at a lower computational cost; this improves the model's ability to perceive all targets. In addition, a receptive field adaptive scaling module based on deformable convolutions shifts the receptive field towards core targets, strengthening the network's attention to core target information and enabling precise recognition of remote sensing images. Experiments on the RSSCN7, AID, and OPTIMAL-31 remote sensing image datasets show that the proposed algorithm achieves high recognition accuracy and parameter identification efficiency.
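The cost argument in the abstract can be illustrated with a minimal sketch. It is not the paper's module: the abstract does not specify how local information is aggregated, so the code assumes mean pooling over non-overlapping windows, one token per window, and plain scaled dot-product attention with identity query/key/value projections. The names `window_tokens`, `self_attention`, and `attention_pairs` are hypothetical and chosen only for this illustration.

```python
import math


def window_tokens(feature_map, window):
    """Average each non-overlapping window x window patch into one token.

    Assumption: mean pooling stands in for the paper's aggregation step.
    feature_map is a nested list of shape [height][width][channels].
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    if height % window or width % window:
        raise ValueError("feature map size must be divisible by the window size")
    tokens = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            sums = [0.0] * channels
            for r in range(top, top + window):
                for c in range(left, left + window):
                    for k in range(channels):
                        sums[k] += feature_map[r][c][k]
            tokens.append([s / (window * window) for s in sums])
    return tokens


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product attention over window tokens.

    Assumption: identity Q/K/V projections (no learned weights).
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for query in tokens:
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        outputs.append(
            [sum(w * value[k] for w, value in zip(weights, tokens)) for k in range(dim)]
        )
    return outputs


def attention_pairs(height, width, window=1):
    """Number of query-key pairs when attending over all tokens."""
    n = (height // window) * (width // window)
    return n * n


# Toy 8 x 8 feature map with 2 channels, pooled with a 4 x 4 window.
fmap = [[[float(r + c), 1.0] for c in range(8)] for r in range(8)]
tokens = window_tokens(fmap, window=4)
context = self_attention(tokens)
print(len(tokens), attention_pairs(8, 8), attention_pairs(8, 8, window=4))
# prints: 4 4096 16
```

In this toy setting, attention over the 4 window tokens needs 16 query-key pairs, against 4096 for pixel-level global attention on the same 8 x 8 map. This is the kind of saving the pseudo global design aims for, although the actual module may aggregate and project features differently.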