Cross-Domain Object Detection Integrating Heterogeneous Attention and Topological Knowledge Diffusion
ZHANG Huili1, SU Ruqi1, ZHU Songhao1, LIANG Zhiwei1
1. College of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023
Abstract To address low detection accuracy, strong background interference, and insufficient hard sample mining in complex cross-domain scenarios such as UAV aerial photography, low-light conditions, and foggy weather, a cross-domain object detection method integrating heterogeneous attention and topological knowledge diffusion (HATKD) is proposed. The method jointly optimizes three core modules. First, a dynamic fusion feature enhancement (DFFE) module employs a dual-path heterogeneous attention mechanism to capture multi-granularity spatial and channel information, thereby selecting highly transferable features and suppressing background noise. Second, a category-aware topological knowledge diffusion module constructs a global topological structure matrix and introduces Hamiltonian graph theory to build a category prototype memory bank; cross-domain semantic relationships are then aligned through intra-class compactness and inter-class separability constraints. Finally, a spatial-aware hard sample mining (SAHSM) module reweights hard samples through three-level filtering based on confidence, geometry, and features, which alleviates the foreground-background imbalance and improves the detection of hard samples. Experimental results on four datasets demonstrate the superior performance of the proposed method, particularly in small object detection and large object detection. Visual heatmaps further confirm its feature focusing ability, and ablation experiments validate the necessity of each module.
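To make the intra-class compactness and inter-class separability constraints concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the loss formulas, so the cosine-based distances, the exponential-moving-average prototype update, the `margin` value, and all names (`PrototypeBank`, `compactness_loss`, `separability_loss`) are assumptions introduced for illustration. The topological structure matrix and the Hamiltonian graph construction are not modeled here.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


class PrototypeBank:
    """Hypothetical per-class prototype memory, updated by EMA (an assumption)."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.protos = {}

    def update(self, label, feature):
        feature = l2_normalize(feature)
        if label not in self.protos:
            self.protos[label] = feature
            return
        m = self.momentum
        mixed = [m * p + (1 - m) * f for p, f in zip(self.protos[label], feature)]
        self.protos[label] = l2_normalize(mixed)


def compactness_loss(bank, features, labels):
    """Mean (1 - cosine) between each feature and its own class prototype."""
    terms = [1.0 - cosine(f, bank.protos[y]) for f, y in zip(features, labels)]
    return sum(terms) / len(terms)


def separability_loss(bank, margin=0.5):
    """Mean hinge penalty on prototype pairs more similar than `margin`."""
    classes = sorted(bank.protos)
    pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]
    if not pairs:
        return 0.0
    terms = [max(0.0, cosine(bank.protos[a], bank.protos[b]) - margin)
             for a, b in pairs]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    bank = PrototypeBank()
    bank.update("car", [1.0, 0.0])       # e.g. source-domain features
    bank.update("person", [0.0, 1.0])
    target_feats = [[0.9, 0.1], [0.2, 0.8]]   # e.g. target-domain features
    target_labels = ["car", "person"]         # e.g. pseudo-labels
    print(compactness_loss(bank, target_feats, target_labels))
    print(separability_loss(bank))
```

In this sketch, the compactness term pulls features from either domain toward a shared class prototype, while the separability term penalizes prototypes of different classes that lie too close together; a weighted sum of the two would act as the alignment objective under these assumptions.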
Received: 01 December 2025
Fund: National Natural Science Foundation of China (No. 52405065)
Corresponding Author:
ZHU Songhao, Ph.D., associate professor. His research interests include image processing and artificial intelligence.
About authors: ZHANG Huili, Master student. His research interests include image processing and deep learning. SU Ruqi, Master student. His research interests include image processing and deep learning. LIANG Zhiwei, Ph.D., associate professor. His research interests include image processing and intelligent perception.