[1] WU Y H, LIU Y, ZHAN X, et al.P2T: Pyramid Pooling Transformer for Scene Understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(11): 12760-12771.
[2] FAN D P, ZHANG J, XU G, et al.Salient Objects in Clutter. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 2344-2366.
[3] LI S Y, LIU H B, QIAN R, et al.TA2N: Two-Stage Action Alignment Network for Few-Shot Action Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(2): 1404-1411.
[4] GUO W Y, ZHANG Y, YANG J F, et al.Re-Attention for Visual Question Answering. IEEE Transactions on Image Processing, 2021, 30: 6730-6743.
[5] WANG T, XU N, CHEN K A, et al.End-to-End Video Instance Segmentation via Spatial-Temporal Graph Neural Networks // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2021: 10777-10786.
[6] LIU Y, GU Y C, ZHANG X Y, et al.Lightweight Salient Object Detection via Hierarchical Visual Perception Learning. IEEE Transactions on Cybernetics, 2021, 51(9): 4439-4449.
[7] FENG T L, LIU J X, YANG J F.Probing Sentiment-Oriented Pre-Training Inspired by Human Sentiment Perception Mechanism // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2023: 2850-2860.
[8] LECUN Y, BENGIO Y, HINTON G. Deep Learning. Nature, 2015, 521(7553): 436-444.
[9] HUANG H Z, WANG Y, HU Q H, et al.Class Specific Semantic Reconstruction for Open Set Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(4): 4214-4228.
[10] GAN R T, FAN J S, WANG Y X, et al.Interact with Open Scenes: A Life-Long Evolution Framework for Interactive Segmentation Models // Proc of the 30th ACM International Conference on Multimedia. New York, USA: ACM, 2022: 5688-5697.
[11] ZHU F, CHENG Z, ZHANG X Y, et al.OpenMix: Exploring Outlier Samples for Misclassification Detection // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2023: 12074-12083.
[12] LI J C, XIE C Y, WU X Y, et al. What Makes Good Open-Vocabulary Detector: A Disassembling Perspective[C/OL].[2023-09-20]. https://arxiv.org/pdf/2309.00227.pdf.
[13] MAO B J, ZHANG X B, WANG L F, et al.Learning from the Target: Dual Prototype Network for Few Shot Semantic Segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(2): 1953-1961.
[14] LI X C, XIA X B, ZHU F, et al.Dynamics-Aware Loss for Learning with Label Noise. Pattern Recognition, 2023, 144. DOI: 10.1016/j.patcog.2023.109835.
[15] CHENG Z, ZHU F, ZHANG X Y, et al.Adversarial Training with Distribution Normalization and Margin Balance. Pattern Recognition, 2023, 136. DOI: 10.1016/j.patcog.2022.109182.
[16] ZHANG Y, TIŇO P, LEONARDIS A, et al. A Survey on Neural Network Interpretability. IEEE Transactions on Emerging Topics in Computational Intelligence, 2021, 5(5): 726-742.
[17] WU Y H, GAO S H, MEI J, et al.JCS: An Explainable COVID-19 Diagnosis System by Joint Classification and Segmentation. IEEE Transactions on Image Processing, 2021, 30: 3113-3126.
[18] TANG K H, ZHANG H W, WU B Y, et al.Learning to Compose Dynamic Tree Structures for Visual Contexts // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2019: 6612-6621.
[19] FAN J S, ZHANG Z X.Memory-Based Cross-Image Contexts for Weakly Supervised Semantic Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(5): 6006-6020.
[20] HUANG Y, WANG Y M, ZENG Y N, et al. MACK: Multimodal Aligned Conceptual Knowledge for Unpaired Image-Text Matching[C/OL].[2023-09-20]. https://papers.nips.cc/paper_files/paper/2022/file/3379ce104189b72d5f7baaa03ae81329-Paper-Conference.pdf.
[21] TIAN K, ZHANG C H, WANG Y, et al.Knowledge Mining and Transferring for Domain Adaptive Object Detection // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2021: 9113-9122.
[22] YU H Y, LI T, YU W C, et al.Regularized Graph Structure Learning with Semantic Knowledge for Multi-variates Time-Series Forecasting // Proc of the 31st International Joint Conference on Artificial Intelligence. San Francisco, USA: IJCAI, 2022: 2362-2368.
[23] FUKUSHIMA K.Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position-Neocognitron. IEICE Technical Report A, 1979, 62(10): 658-665.
[24] LOWE D G.Object Recognition from Local Scale-Invariant Features // Proc of the 7th IEEE International Conference on Computer Vision. Washington, USA: IEEE, 1999. DOI: 10.1109/ICCV.1999.790410.
[25] LOWE D G.Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 2004, 60(2): 91-110.
[26] DETONE D, MALISIEWICZ T, RABINOVICH A.SuperPoint: Self-Supervised Interest Point Detection and Description // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Washington, USA: IEEE, 2018: 337-349.
[27] LIU Y, SHEN Z H, LIN Z X, et al. GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs[C/OL].[2023-09-20]. https://arxiv.org/pdf/1911.05932.pdf.
[28] LIN W Y, HE X Y, DAI W R, et al.Key-Point Sequence Lossless Compression for Intelligent Video Analysis. IEEE MultiMedia, 2020, 27(3): 12-22.
[29] DUDA R O, HART P E.Use of Hough Transformation to Detect Lines and Curves in Pictures. Communications of the ACM, 1972, 15(1): 11-15.
[30] ZHANG Z H, LI Z X, BI N, et al.PPGNet: Learning Point-Pair Graph for Line Segment Detection // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2019: 7098-7107.
[31] XUE N, BAI S, WANG F D, et al.Learning Attraction Field Map for Robust Line Segment Detection // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2019: 1595-1603.
[32] LEE J T, KIM H U, LEE C, et al.Semantic Line Detection and its Applications // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2017: 3249-3257.
[33] HAN Q, ZHAO K, XU J, et al.Deep Hough Transform for Semantic Line Detection // Proc of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 249-265.
[34] ZHAO K, HAN Q, ZHANG C B, et al.Deep Hough Transform for Semantic Line Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(9): 4793-4806.
[35] KANOPOULOS N, VASANTHAVADA N, BAKER R L.Design of an Image Edge Detection Filter Using the Sobel Operator. IEEE Journal of Solid-State Circuits, 1988, 23(2): 358-367.
[36] SORIA X, RIBA E, SAPPA A.Dense Extreme Inception Network: Towards a Robust CNN Model for Edge Detection // Proc of the IEEE Winter Conference on Applications of Computer Vision. Washington, USA: IEEE, 2020: 1912-1921.
[37] LIU C, YANG J M, CEYLAN D, et al.PlaneNet: Piece-Wise Planar Reconstruction From a Single RGB Image // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2018: 2579-2588.
[38] LIU C, KIM K, GU J W, et al.PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2019: 4445-4454.
[39] WANG T, LING H B.Gracker: A Graph-Based Planar Object Tracker. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(6): 1494-1501.
[40] ZHANG Z C, LIU S Z, YANG J F.Multiple Planar Object Tracking // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2023: 23460-23470.
[41] ZHANG Z C, CHEN S, WANG Z C, et al.PlaneSeg: Building a Plug-In for Boosting Planar Region Segmentation. IEEE Transactions on Neural Networks and Learning Systems, 2023. DOI: 10.1109/TNNLS.2023.3262544.
[42] LIU J J, LIU Z A, PENG P, et al.Rethinking the U-Shape Structure for Salient Object Detection. IEEE Transactions on Image Processing, 2021, 30: 9030-9042.
[43] WU Y H, LIU Y, ZHANG L, et al.EDN: Salient Object Detection via Extremely-Downsampled Network. IEEE Transactions on Image Processing, 2022, 31: 3125-3136.
[44] WU Y H, LIU Y, ZHANG L, et al.Regularized Densely-Connected Pyramid Network for Salient Instance Segmentation. IEEE Transactions on Image Processing, 2021, 30: 3897-3907.
[45] LIU J J, HOU Q B, LIU Z A, et al.PoolNet+: Exploring the Potential of Pooling for Salient Object Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 887-904.
[46] FAN D P, CHENG M M, LIU Y, et al.Structure-Measure: A New Way to Evaluate Foreground Maps // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2017: 4558-4567.
[47] ZHOU T, FAN D P, CHENG M M, et al.RGB-D Salient Object Detection: A Survey. Computational Visual Media, 2021, 7: 37-69.
[48] FAN D P, ZHAI Y J, BORJI A, et al.BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network // Proc of the 16th European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 275-292.
[49] ZHAI Y J, FAN D P, YANG J F, et al.Bifurcated Backbone Strategy for RGB-D Salient Object Detection. IEEE Transactions on Image Processing, 2021, 30: 8727-8742.
[50] WU Y H, LIU Y, XU J, et al.MobileSal: Extremely Efficient RGB-D Salient Object Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(12): 10261-10269.
[51] GAO S H, TAN Y Q, CHENG M M, et al.Highly Efficient Salient Object Detection with 100K Parameters // Proc of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 702-721.
[52] CHENG M M, GAO S H, BORJI A, et al.A Highly Efficient Model to Study the Semantics of Salient Object Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(11): 8006-8021.
[53] MINSKY M.The Society of Mind. New York, USA: Simon & Schuster, 1988.
[54] LIU S Z, ZHANG X, YANG J F.SER30K: A Large-Scale Dataset for Sticker Emotion Recognition // Proc of the 30th ACM International Conference on Multimedia. New York, USA: ACM, 2022: 33-41.
[55] ZHAO S J, GE Y X, QI Z A, et al. Sticker820K: Empowering Interactive Retrieval with Stickers[C/OL].[2023-09-20]. https://arxiv.org/pdf/2306.06870.pdf.
[56] WANG L J, GUO W Y, YAO X X, et al.Multimodal Event-Aware Network for Sentiment Analysis in Tourism. IEEE MultiMedia, 2021, 28(2): 49-58.
[57] WEN C S, JIA G L, YANG J F.DIP: Dual Incongruity Perceiving Network for Sarcasm Detection // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2023: 2540-2550.
[58] ESTRADA M L B, CABADA R Z, BUSTILLOS R O, et al. Opinion Mining and Emotion Recognition Applied to Learning Environments. Expert Systems with Applications, 2020, 150. DOI: 10.1016/j.eswa.2020.113265.
[59] BORTH D, CHEN T, JI R T, et al.SentiBank: Large-Scale Ontology and Classifiers for Detecting Sentiment and Emotions in Visual Content // Proc of the 21st ACM International Conference on Multimedia. New York, USA: ACM, 2013: 459-460.
[60] SUN M, YANG J F, WANG K, et al.Discovering Affective Regions in Deep Convolutional Neural Networks for Visual Sentiment Prediction // Proc of the IEEE International Conference on Multimedia and Expo. Washington, USA: IEEE, 2016. DOI: 10.1109/ICME.2016.7552961.
[61] SHE D Y, YANG J F, CHENG M M, et al.WSCNet: Weakly Supervised Coupled Networks for Visual Sentiment Classification and Detection. IEEE Transactions on Multimedia, 2020, 22(5): 1358-1371.
[62] YANG Y, JIA J, ZHANG S M, et al.How Do Your Friends on Social Media Disclose Your Emotions? Proceedings of the AAAI Conference on Artificial Intelligence, 2014, 28(1): 306-312.
[63] YANG J Y, LI J, LI L D, et al.Seeking Subjectivity in Visual Emotion Distribution Learning. IEEE Transactions on Image Processing, 2022, 31: 5189-5202.
[64] WANG L J, ZHANG X, JIANG N, et al.D2S: Dynamic Distribution Supervision for Multi-label Facial Expression Recognition // Proc of the IEEE International Conference on Multimedia and Expo. Washington, USA: IEEE, 2022. DOI: 10.1109/ICME52920.2022.9859687.
[65] YANG J F, SUN M, SUN X X.Learning Visual Sentiment Distributions via Augmented Conditional Probability Neural Network. Proceedings of the AAAI Conference on Artificial Intelligence, 2017, 31(1): 224-230.
[66] YANG J F, SHE D Y, SUN M.Joint Image Emotion Classification and Distribution Learning via Deep Convolutional Neural Network // Proc of the 26th International Joint Conference on Artificial Intelligence. San Francisco, USA: IJCAI, 2017: 3266-3272.
[67] YANG J F, SHE D Y, LAI Y K, et al.Retrieving and Classifying Affective Images via Deep Metric Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 2018, 32(1): 491-498.
[68] YAO X X, SHE D Y, ZHAO S C, et al.Attention-Aware Polarity Sensitive Embedding for Affective Image Retrieval // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2019: 1140-1150.
[69] JIA G L, YANG J F.S2-VER: Semi-Supervised Visual Emotion Recognition // Proc of the 17th European Conference on Computer Vision. Berlin, Germany: Springer, 2022: 493-509.
[70] YOU Q Z, LUO J B, JIN H L, et al.Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 2015, 29(1): 381-388.
[71] PAN J C, WANG S F, FANG L.Representation Learning through Multimodal Attention and Time-Sync Comments for Affective Video Content Analysis // Proc of the 30th ACM International Conference on Multimedia. New York, USA: ACM, 2022: 42-50.
[72] ZHAO S C, JIA G L, YANG J F, et al.Emotion Recognition from Multiple Modalities: Fundamentals and Methodologies. IEEE Signal Processing Magazine, 2021, 38(6): 59-73.
[73] ZHAO S C, MA Y S, GU Y, et al.An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(1): 303-311.
[74] ZHANG Z C, YANG J F.Temporal Sentiment Localization: Listen and Look in Untrimmed Videos // Proc of the 30th ACM International Conference on Multimedia. New York, USA: ACM, 2022: 199-208.
[75] ZHANG Z C, WANG L J, YANG J F.Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2023: 18888-18897.
[76] LI P, YANG Y, ZHAO W D, et al.Evaluation of Image Fire Detection Algorithms Based on Image Complexity. Fire Safety Journal, 2021, 121. DOI: 10.1016/j.firesaf.2021.103306.
[77] DAI L C, ZHANG K, ZHENG X S, et al.Visual Complexity of Shapes: A Hierarchical Perceptual Learning Model. The Visual Computer, 2022, 38: 419-432.
[78] OLIVA A, MACK M L, SHRESTHA M, et al.Identifying the Perceptual Dimensions of Visual Complexity of Scenes // Proc of the Annual Meeting of the Cognitive Science Society. New York, USA: ACM, 2004: 1041-1046.
[79] CHEN Y Q, DUAN J, ZHU Y, et al.Research on the Image Complexity Based on Neural Network // Proc of the International Conference on Machine Learning and Cybernetics. Washington, USA: IEEE, 2015: 295-300.
[80] SARAEE E, JALAL M, BETKE M.Visual Complexity Analysis Using Deep Intermediate-Layer Features. Computer Vision and Image Understanding, 2020, 195. DOI: 10.1016/j.cviu.2020.102949.
[81] FENG T L, ZHAI Y J, YANG J F, et al.IC9600: A Benchmark Dataset for Automatic Image Complexity Assessment. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(7): 8577-8593.
[82] KHOSLA A, RAJU A S, TORRALBA A, et al.Understanding and Predicting Image Memorability at a Large Scale // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2015: 2390-2398.
[83] SHOKRI R, STRONATI M, SONG C Z, et al.Membership Inference Attacks against Machine Learning Models // Proc of the IEEE Symposium on Security and Privacy. Washington, USA: IEEE, 2017: 3-18.
[84] ARPIT D, JASTRZĘBSKI S, BALLAS N, et al. A Closer Look at Memorization in Deep Networks // Proc of the 34th International Conference on Machine Learning. San Diego, USA: JMLR, 2017: 233-242.
[85] WANG Z, BOVIK A C, SHEIKH H R, et al.Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Transactions on Image Processing, 2004, 13(4): 600-612.
[86] GIROD B.What's Wrong with Mean Squared Error // WATSON A B, ed. Digital Images and Human Vision. Cambridge, USA: MIT Press, 1993: 207-220.
[87] ZHANG R, ISOLA P, EFROS A A, et al.The Unreasonable Effectiveness of Deep Features as a Perceptual Metric // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2018: 586-595.
[88] DING K Y, MA K D, WANG S Q, et al.Image Quality Assessment: Unifying Structure and Texture Similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(5): 2567-2581.
[89] ROY S, MITRA S, BISWAS S, et al.Test Time Adaptation for Blind Image Quality Assessment // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2023: 16742-16751.
[90] LIU X L, VAN DE WEIJER J, BAGDANOV A D. RankIQA: Learning from Rankings for No-Reference Image Quality Assessment // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2017: 1040-1049.
[91] SU S L, YAN Q S, ZHU Y, et al.Blindly Assess Image Quality in the Wild Guided by a Self-Adaptive Hyper Network // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2020: 3664-3673.
[92] ZHANG W X, LI D Q, MA C, et al.Continual Learning for Blind Image Quality Assessment. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(3): 2864-2878.
[93] KANG L, YE P, LI Y, et al.Convolutional Neural Networks for No-Reference Image Quality Assessment // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2014: 1733-1740.
[94] PAN D, SHI P, HOU M, et al.Blind Predicting Similar Quality Map for Image Quality Assessment // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2018: 6373-6382.
[95] MURRAY N, MARCHESOTTI L, PERRONNIN F.AVA: A Large-Scale Database for Aesthetic Visual Analysis // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2012: 2408-2415.
[96] ZHANG X D, GAO X B, LU W, et al.A Gated Peripheral-Foveal Convolutional Neural Network for Unified Image Aesthetic Prediction. IEEE Transactions on Multimedia, 2019, 21(11): 2815-2826.
[97] ZHUANG B H, LIU L Q, LI Y, et al.Attend in Groups: A Weakly-Supervised Deep Learning Framework for Learning from Web Data // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2017: 2915-2924.
[98] NAYAK G, GHOSH R, JIA X W, et al. Weakly Supervised Classification Using Group-Level Labels[C/OL].[2023-09-20]. https://arxiv.org/pdf/2108.07330v1.pdf.
[99] XU Y H, QIAN Q, LI H, et al.Weakly Supervised Representation Learning with Coarse Labels // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2021: 10573-10581.
[100] JIANG B, WANG L L, CHENG J, et al.GPENs: Graph Data Learning with Graph Propagation-Embedding Networks. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(8): 3925-3938.
[101] ZHENG Z H, YE R G, WANG P, et al.Localization Distillation for Dense Object Detection // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2022: 9397-9406.
[102] ZHANG S Q, LI C L, JIA Z, et al.DiagIoU Loss for Object Detection. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(12): 7671-7683.
[103] CHEN Z M, CHEN K, LIN W Y, et al.PIoU Loss: Towards Accurate Oriented Object Detection in Complex Environments // Proc of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 195-211.
[104] ZHANG D W, HAN J W, CHENG G, et al.Weakly Supervised Object Localization and Detection: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 44(9): 5866-5885.
[105] SHAO F F, CHEN L, SHAO J, et al.Deep Learning for Weakly-Supervised Object Detection and Object Localization: A Survey. Neurocomputing, 2022, 496: 192-207.
[106] ZHANG Y M, CHEN T.Weakly Supervised Object Recognition and Localization with Invariant High Order Features[C/OL]. [2023-09-20].https://bmvc10.dcs.aber.ac.uk/proc/conference/paper47/paper47.pdf.
[107] TANG Y X, WANG X F, DELLANDREA E, et al.Fusing Generic Objectness and Deformable Part-Based Models for Weakly Supervised Object Detection // Proc of the IEEE International Conference on Image Processing. Washington, USA: IEEE, 2014: 4072-4076.
[108] SIVA P, RUSSELL C, XIANG T, et al.Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2013: 3238-3245.
[109] SHI Z Y, HOSPEDALES T M, XIANG T.Bayesian Joint Modelling for Object Localisation in Weakly Labelled Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(10): 1959-1972.
[110] CINBIS R G, VERBEEK J, SCHMID C.Multi-fold MIL Training for Weakly Supervised Object Localization // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2014: 2409-2416.
[111] DESELAERS T, ALEXE B, FERRARI V.Localizing Objects While Learning Their Appearance // Proc of the European Conference on Computer Vision. Berlin, Germany: Springer, 2010: 452-466.
[112] SINGH K K, XIAO F Y, LEE Y J.Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2016: 3548-3556.
[113] LI D, HUANG J B, LI Y L, et al.Weakly Supervised Object Localization with Progressive Domain Adaptation // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2016: 3512-3520.
[114] SHI M J, CAESAR H, FERRARI V.Weakly Supervised Object Localization Using Things and Stuff Transfer // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2017: 3401-3410.
[115] ZHANG D W, HAN J W, ZHAO L, et al.Leveraging Prior-Knowledge for Weakly Supervised Object Detection under a Collaborative Self-Paced Curriculum Learning Framework. International Journal of Computer Vision, 2019, 127: 363-380.
[116] SANG H B, NI Z L, HE H Y, et al.Trace-Level Invisible Enhanced Network for 6D Pose Estimation // Proc of the IEEE International Conference on Multimedia and Expo. Washington, USA: IEEE, 2022. DOI: 10.1109/ICME52920.2022.9859613.
[117] JIANG P T, HAN L H, HOU Q B, et al.Online Attention Accumulation for Weakly Supervised Semantic Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(10): 7062-7077.
[118] LIU Y, WU Y H, WEN P S, et al.Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(3): 1415-1428.
[119] LIN Z, DUAN Z P, ZHANG Z, et al.KnifeCut: Refining Thin Part Segmentation with Cutting Lines // Proc of the 30th ACM International Conference on Multimedia. New York, USA: ACM, 2022: 809-817.
[120] 侯淇彬,韩凌昊,刘姜江,等.互联网图像驱动的语义分割自主学习.中国科学(信息科学), 2021, 51(7): 1084-1099.
(HOU Q B, HAN L H, LIU J J, et al.Autonomous Learning of Semantic Segmentation from Internet Images. Scientia Sinica Informationis, 2021, 51(7): 1084-1099.)
[121] MEI J, CHENG M M, XU G, et al.SANet: A Slice-Aware Network for Pulmonary Nodule Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(8): 4374-4387.
[122] CHEN J, LI Z H, LUO J B, et al.Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2020: 9898-9908.
[123] LIU Q, RAMANATHAN V, MAHAJAN D, et al.Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2021: 13963-13973.
[124] LI Y X, XU N, YANG W J, et al.Exploring the Semi-Supervised Video Object Segmentation Problem from a Cyclic Perspective. International Journal of Computer Vision, 2022, 130(10): 2408-2424.
[125] WANG X L, JABRI A, EFROS A A.Learning Correspondence from the Cycle-Consistency of Time // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2019: 2566-2576.
[126] LI X T, LIU S F, DE MELLO S, et al.Joint-Task Self-Supervised Learning for Temporal Correspondence // Proc of the 33rd International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2019: 318-328.
[127] YAN L Q, WANG Q F, MA S Q, et al.Solve the Puzzle of Instance Segmentation in Videos: A Weakly Supervised Framework with Spatio-Temporal Collaboration. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(1): 393-406.
[128] LIN F C, XIE H T, LIU C B, et al.Bilateral Temporal Re-Aggregation for Weakly-Supervised Video Object Segmentation. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(7): 4498-4512.
[129] LIU P D, HE Z B, YAN X Y, et al.WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations // Proc of the 29th ACM International Conference on Multimedia. New York, USA: ACM, 2021: 2995-3004.
[130] ZHANG Z, JIN W D, XU J, et al.Gradient-Induced Co-Saliency Detection // Proc of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 455-472.
[131] LI Y X, LIN W Y, WANG T, et al.Video Summarization via Cluster-Based Object Tracking and Type-Based Synopsis // Proc of the IEEE Conference on Multimedia Information Processing and Retrieval. Washington, USA: IEEE, 2020: 113-116.
[132] LOCATELLO F, WEISSENBORN D, UNTERTHINER T, et al. Object-Centric Learning with Slot Attention[C/OL].[2023-09-20]. https://arxiv.org/pdf/2006.15055.pdf.
[133] VO V H, SIZIKOVA E, SCHMID C, et al. Large-Scale Unsupervised Object Discovery[C/OL].[2023-09-20]. https://arxiv.org/abs/2106.06650.
[134] GAO S H, LI Z Y, YANG M H, et al.Large-Scale Unsupervised Semantic Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(6): 7457-7476.
[135] YAO X X, ZHAO S C, XU P F, et al.Multi-source Domain Adaptation for Object Detection // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2021: 3253-3262.
[136] QIAN R, LI Y X, LIU H B, et al.Enhancing Self-Supervised Video Representation Learning via Multi-level Feature Optimization // Proc of the IEEE/CVF International Conference on Computer Vision. Washington, USA: IEEE, 2021: 7970-7981.
[137] CHEN S, XUE J H, CHANG J L, et al.SSL++: Improving Self-supervised Learning by Mitigating the Proxy Task-Specificity Problem. IEEE Transactions on Image Processing, 2021, 31: 1134-1148.
[138] YAO X X, ZHAO S C, LAI Y K, et al.APSE: Attention-Aware Polarity-Sensitive Embedding for Emotion-Based Image Retrieval. IEEE Transactions on Multimedia, 2020, 23: 4469-4482.
[139] VAN DEN OORD A, KALCHBRENNER N, ESPEHOLT L, et al. Conditional Image Generation with PixelCNN Decoders // Proc of the 30th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2016: 4797-4805.
[140] VAN DEN OORD A, KALCHBRENNER N, KAVUKCUOGLU K. Pixel Recurrent Neural Networks // Proc of the 33rd International Conference on Machine Learning. San Diego, USA: JMLR, 2016: 1747-1756.
[141] KINGMA D P, DHARIWAL P.Glow: Generative Flow with Invertible 1×1 Convolutions // Proc of the 32nd International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2018: 10236-10245.
[142] DINH L, SOHL-DICKSTEIN J, BENGIO S.Density Estimation Using Real NVP[C/OL]. [2023-09-20].https://arxiv.org/pdf/1605.08803.pdf.
[143] VAN DEN OORD A, VINYALS O, KAVUKCUOGLU K, et al. Neural Discrete Representation Learning // Proc of the 31st International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2017: 6309-6318.
[144] HE K M, CHEN X L, XIE S N, et al.Masked Autoencoders Are Scalable Vision Learners // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2022: 15979-15988.
[145] DEVLIN J, CHANG M W, LEE K, et al.BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding // Proc of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies(Long and Short Papers). Stroudsburg, USA: ACL, 2019: 4171-4186.
[146] WEI C, FAN H Q, XIE S N, et al.Masked Feature Prediction for Self-Supervised Visual Pre-Training // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2022: 14648-14658.
[147] LI S Y, WU D, WU F, et al.Architecture-Agnostic Masked Image Modeling-From ViT Back to CNN // Proc of the 40th International Conference on Machine Learning. San Diego, USA: JMLR, 2023: 20149-20167.
[148] HOU Z J, SUN F, CHEN Y K, et al. MILAN: Masked Image Pretraining on Language Assisted Representation[C/OL]. [2023-09-20]. https://arxiv.org/pdf/2208.06049.pdf.
[149] ZENG D L, LIAO M Y, TAVAKOLIAN M, et al. Deep Learning for Scene Classification: A Survey[C/OL]. [2023-09-20]. https://arxiv.org/abs/2101.10531.
[150] SIAGIAN C, ITTI L. Rapid Biologically-Inspired Scene Classification Using Features Shared with Visual Attention. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(2): 300-312.
[151] REN Z, QIAN K, ZHANG Z X, et al. Deep Scalogram Representations for Acoustic Scene Classification. IEEE/CAA Journal of Automatica Sinica, 2018, 5(3): 662-669.
[152] WANG L, SNG D. Deep Learning Algorithms with Applications to Video Analytics for A Smart City: A Survey[C/OL]. [2023-09-20]. https://arxiv.org/abs/1512.03131.
[153] LÓPEZ-CIFUENTES A, ESCUDERO-VINOLO M, BESCÓS J, et al. Semantic-Aware Scene Recognition. Pattern Recognition, 2020, 102. DOI: 10.1016/j.patcog.2020.107256.
[154] TONG Z H, SHI D X, YAN B Z, et al. A Review of Indoor-Outdoor Scene Classification // Proc of the 2nd International Conference on Control, Automation and Artificial Intelligence. New York, USA: ACM, 2017: 469-474.
[155] CHENG G, HAN J W, LU X Q. Remote Sensing Image Scene Classification: Benchmark and State of the Art. Proceedings of the IEEE, 2017, 105(10): 1865-1883.
[156] XIA G S, HU J W, HU F, et al. AID: A Benchmark Dataset for Performance Evaluation of Aerial Scene Classification. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(7): 3965-3981.
[157] MESAROS A, HEITTOLA T, VIRTANEN T. TUT Database for Acoustic Scene Classification and Sound Event Detection // Proc of the 24th European Signal Processing Conference. Washington, USA: IEEE, 2016: 1128-1132.
[158] LOWRY S, SÜNDERHAUF N, NEWMAN P, et al. Visual Place Recognition: A Survey. IEEE Transactions on Robotics, 2016, 32(1): 1-19.
[159] ARANDJELOVIC R, GRONAT P, TORII A, et al. NetVLAD: CNN Architecture for Weakly Supervised Place Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(6): 1437-1451.
[160] BROWN M, SÜSSTRUNK S. Multispectral SIFT for Scene Category Recognition // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2011: 177-184.
[161] VISWANATHAN D G. Features from Accelerated Segment Test (FAST) [C/OL]. [2023-09-20]. https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/AV1011/AV1FeaturefromAcceleratedSegmentTest.pdf.
[162] BAY H, ESS A, TUYTELAARS T, et al. Speeded-Up Robust Features(SURF). Computer Vision and Image Understanding, 2008, 110(3): 346-359.
[163] JEEVAN P P, VISWANATHAN K, ANANDU A S, et al. Wave-Mix: A Resource-Efficient Neural Network for Image Analysis[C/OL]. [2023-09-20]. https://arxiv.org/abs/2205.14375.
[164] WANG Q L, XIE J T, ZUO W M, et al. Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(8): 2582-2597.
[165] ZHANG Y F, LI Y X, ZHAO M B, et al. Object Distance Estimation Based on Stereo Regional Disparity Regression. Journal of Image and Graphics, 2021, 26(7): 1604-1613.
[166] ZHANG Y F, LI Y X, ZHAO M B, et al. A Regional Regression Network for Monocular Object Distance Estimation // Proc of the IEEE International Conference on Multimedia and Expo Workshops. Washington, USA: IEEE, 2020. DOI: 10.1109/ICMEW46912.2020.9106012.
[167] LIU H B, LI J G, LI D, et al. Learning Scale-Consistent Attention Part Network for Fine-Grained Image Recognition. IEEE Transactions on Multimedia, 2021, 24: 2902-2913.
[168] FAN J H, LIU H B, YANG W J, et al. Speed Up Object Detection on Gigapixel-Level Images With Patch Arrangement // Proc of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2022: 4643-4651.
[169] GUO M H, LU C Z, HOU Q B, et al. SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation[C/OL]. [2023-09-20]. https://arxiv.org/pdf/2209.08575v1.pdf.
[170] MEI J, LI R J, GAO W, et al. CoANet: Connectivity Attention Network for Road Extraction From Satellite Imagery. IEEE Transactions on Image Processing, 2021, 30: 8540-8552.
[171] PRANGEMEIER T, REICH C, KOEPPL H. Attention-Based Transformers for Instance Segmentation of Cells in Microstructures // Proc of the IEEE International Conference on Bioinformatics and Biomedicine. Washington, USA: IEEE, 2020: 700-707.
[172] LI S Y, LIU H B, FEI M J, et al. Temporal Alignment via Event Boundary for Few-shot Action Recognition[C/OL]. [2023-09-20]. https://www.bmvc2021-virtualconference.com/assets/papers/0878.pdf.
[173] LIU H B, LÜ W X, SEE J, et al. Task-adaptive Spatial-Temporal Video Sampler for Few-Shot Action Recognition // Proc of the 30th ACM International Conference on Multimedia. New York, USA: ACM, 2022: 6230-6240.
[174] LI Y X, LIN W Y, SEE J, et al. CFAD: Coarse-to-Fine Action Detector for Spatiotemporal Action Localization // Proc of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 510-527.
[175] LI Y X, ZHANG B S, LI J, et al. LSTC: Boosting Atomic Action Detection with Long-Short-Term Context // Proc of the 29th ACM International Conference on Multimedia. New York, USA: ACM, 2021: 2158-2166.
[176] QIAN R, HU D, DINKEL H, et al. Multiple Sound Sources Localization from Coarse to Fine // Proc of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 292-308.
[177] LI C L, LIU L, LU A D, et al. Challenge-Aware RGBT Tracking // Proc of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 222-237.
[178] CHANG X J, REN P Z, XU P F, et al. A Comprehensive Survey of Scene Graphs: Generation and Application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 1-26.
[179] ZAREIAN A, KARAMAN S, CHANG S F. Bridging Knowledge Graphs to Generate Scene Graphs // Proc of the European Conference on Computer Vision. Berlin, Germany: Springer, 2020: 606-623.
[180] LI H S, ZHU G M, ZHANG L, et al. Scene Graph Generation: A Comprehensive Survey. Neurocomputing, 2023. DOI: 10.1016/j.neucom.2023.127052.
[181] LI Y K, OUYANG W L, ZHOU B L, et al. Factorizable Net: An Efficient Subgraph-Based Framework for Scene Graph Generation // Proc of the European Conference on Computer Vision. Berlin, Germany: Springer, 2018: 346-363.
[182] LAFFERTY J, MCCALLUM A, PEREIRA F C N. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data // Proc of the 18th International Conference on Machine Learning. San Diego, USA: JMLR, 2001: 282-289.
[183] BORDES A, USUNIER N, GARCIA-DURAN A, et al. Translating Embeddings for Modeling Multi-relational Data // Proc of the 26th International Conference on Neural Information Processing Systems. Cambridge, USA: MIT Press, 2013: 2787-2795.
[184] DAI B, ZHANG Y Q, LIN D H. Detecting Visual Relationships with Deep Relational Networks // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2017: 3298-3308.
[185] CONG W L, WANG W, LEE W C. Scene Graph Generation via Conditional Random Fields[C/OL]. [2023-09-20]. https://arxiv.org/pdf/1811.08075.pdf.
[186] ZHANG H W, KYAW Z, CHANG S F, et al. Visual Translation Embedding Network for Visual Relation Detection // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2017: 3107-3115.
[187] HUNG Z S, MALLYA A, LAZEBNIK S. Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 43(11): 3820-3832.
[188] XU D F, ZHU Y K, CHOY C B, et al. Scene Graph Generation by Iterative Message Passing // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, USA: IEEE, 2017: 3097-3106.
[189] PLUMMER B A, MALLYA A, CERVANTES C M, et al. Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues // Proc of the IEEE International Conference on Computer Vision. Washington, USA: IEEE, 2017: 1946-1955.