Abstract: Super-resolution reconstruction of visible-light remote sensing images requires jointly optimizing local texture recovery and long-range structural consistency. Traditional Transformer networks can model long-range dependencies but are not sufficiently sensitive to high-frequency textures. To address this issue, a visible-light remote sensing image super-resolution network based on space-frequency alternating self-attention (SFASR) is proposed. The network serially alternates frequency-domain and spatial-domain self-attention, which model local textures and cross-regional long-range dependencies, respectively. Specifically, a phase-aware frequency self-attention mechanism performs the frequency-domain self-attention computation and explicitly models phase differences to enhance high-frequency texture reconstruction. A channel-enhanced permutation self-attention mechanism performs the spatial-domain self-attention computation; by incorporating channel attention, it strengthens feature representation and global structural consistency. Experimental results show that SFASR effectively addresses high-frequency information loss and structural breakage and improves image reconstruction quality.
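To make the idea of attention that "explicitly models phase differences" more concrete, the following NumPy sketch shows one possible reading of it. It is not the SFASR implementation: the function name `phase_aware_frequency_attention`, the choice of channels as attention tokens, and the additive amplitude-plus-phase score are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def phase_aware_frequency_attention(feat):
    """Toy frequency-domain self-attention over channels (hypothetical design).

    feat: real array of shape (C, H, W).
    Returns (out, attn): out has shape (C, H, W), attn has shape (C, C).
    """
    C, H, W = feat.shape

    # Move each channel to the frequency domain (complex half-spectrum).
    spec = np.fft.rfft2(feat, axes=(-2, -1))
    amp = np.abs(spec).reshape(C, -1)      # amplitude per frequency bin
    phase = np.angle(spec).reshape(C, -1)  # phase per frequency bin

    # Amplitude similarity: cosine similarity between amplitude spectra.
    amp_unit = amp / (np.linalg.norm(amp, axis=1, keepdims=True) + 1e-8)
    amp_sim = amp_unit @ amp_unit.T

    # Phase agreement: mean of cos(phase_i - phase_j) over frequency bins,
    # expanded as cos(a)cos(b) + sin(a)sin(b) so it becomes two matmuls.
    n_bins = phase.shape[1]
    phase_sim = (np.cos(phase) @ np.cos(phase).T
                 + np.sin(phase) @ np.sin(phase).T) / n_bins

    # Assumed scoring rule: add the two terms, then normalize each row.
    attn = softmax(amp_sim + phase_sim, axis=-1)

    # Mix the complex spectra with the attention weights and invert the FFT.
    mixed = (attn @ spec.reshape(C, -1)).reshape(spec.shape)
    out = np.fft.irfft2(mixed, s=(H, W), axes=(-2, -1))
    return out, attn
```

Calling `phase_aware_frequency_attention(x)` on an array `x` of shape `(C, H, W)` returns a feature map of the same shape together with the `(C, C)` attention matrix. In this sketch, channels whose spectra agree in both amplitude and phase receive larger mutual weights; a learned model would instead use trainable query, key, and value projections.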