Abstract: Lacking precise bounding box annotations, weakly supervised object detectors rely on a pretrained image classification model to classify candidate regions. However, the pretrained model often responds strongly to discriminative regions rather than to complete objects, which leads to part domination, missing instances and untight boxes. To address these issues, a weakly supervised object detection network based on multi-level fusion is proposed. Detection performance is improved in three ways: enhancing the learning of weakly discriminative spatial features, enriching intra-class sample features, and weighting reliable pseudo-labels. First, a power pooling layer uses a power function to weight and fuse the activation values within each neighborhood, reducing the information loss of weakly discriminative features. Second, a feature mixing method randomly fuses the feature vectors of candidate regions to enrich the diversity of training sample features. Finally, a confidence-based sample re-weighting strategy fuses the confidence of predictions and pseudo-labels to adjust the influence of pseudo-labels on training. Experiments on three benchmarks demonstrate the superiority of the proposed network.
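The power pooling step can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the form below is an assumption: each activation in a window is weighted by itself raised to a power `p`, and the weighted activations are averaged. The names `power_pool`, `power_pool2d`, `kernel`, `p` and `eps` are hypothetical, and activations are assumed to be non-negative (for example, post-ReLU).

```python
def power_pool(window, p=1.0, eps=1e-8):
    """Fuse non-negative activations with weights proportional to v**p.

    Assumed form (not taken from the paper): sum(v**p * v) / sum(v**p).
    """
    weights = [v ** p for v in window]
    weighted_sum = sum(w * v for w, v in zip(weights, window))
    # eps avoids division by zero when every activation is zero.
    return weighted_sum / (sum(weights) + eps)


def power_pool2d(fmap, kernel=2, p=1.0):
    """Apply power_pool over non-overlapping kernel x kernel windows.

    fmap is a list of equal-length rows; trailing rows and columns that
    do not fill a complete window are ignored.
    """
    rows, cols = len(fmap) // kernel, len(fmap[0]) // kernel
    pooled = []
    for r in range(rows):
        pooled_row = []
        for c in range(cols):
            # Flatten the current window into a list of activations.
            window = [
                fmap[r * kernel + i][c * kernel + j]
                for i in range(kernel)
                for j in range(kernel)
            ]
            pooled_row.append(power_pool(window, p))
        pooled.append(pooled_row)
    return pooled


if __name__ == "__main__":
    feature_map = [[1.0, 3.0], [1.0, 3.0]]
    for exponent in (0.0, 1.0, 50.0):
        print(exponent, power_pool2d(feature_map, kernel=2, p=exponent))
```

Under this assumed form, `p = 0` reduces to average pooling and a large `p` approaches max pooling, so intermediate exponents keep some contribution from weaker activations instead of discarding them.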