Pattern Recognition and Artificial Intelligence
2024 Vol.37 Issue.8, Published 2024-08-25

Detection and Recognition Algorithms in Realistic Scene
663 Window Anchored Offset Constrained Dynamic Snake Convolutional Network for Aerial Small Target Detection
ZHANG Rongguo, QIN Zhen, HU Jing, WANG Lifang, LIU Xiaojun
To extract key information from the limited features of small targets and improve their localization and detection accuracy, a window anchored offset constrained dynamic snake convolutional network for aerial small target detection is proposed. Firstly, an offset constrained dynamic snake convolution is constructed. By dynamically offsetting in different directions, the constrained snake convolution kernel adaptively focuses on feature regions of different sizes and shapes, concentrating feature extraction on tiny local structures and thereby facilitating the capture of small target features. Secondly, a two-stage multi-scale feature fusion method performs feature alignment fusion and injection on feature maps from different layers, enhancing the fusion of low-level detail information with high-level semantic information and strengthening the transmission of information about targets of different sizes. Thus, the detection capability of the method for small targets is improved. Meanwhile, a window anchored bounding box regression loss function is designed. The function performs bounding box regression based on an auxiliary bounding box and the minimum point distance to achieve more accurate regression and enhance the small target localization capability of the model. Finally, comparative experiments on three aerial photography datasets show that the proposed method improves small target detection performance to varying degrees.
2024 Vol. 37 (8): 663-677
678 Dynamic Supervised Camouflaged Object Detection with Semantic Reconstruction
JIANG Wentao, WANG Bohan
Camouflaged object detection (COD) aims to segment target objects that blend visually into their surrounding environments. However, the many similar distractors shared by the foreground and background lead to significant segmentation errors. To address this issue, a dynamic supervised camouflaged object detection network with semantic reconstruction (DSSRNet) is proposed. It achieves accurate segmentation of camouflaged objects by reconstructing the spatial semantics of the feature map and introducing confidence to guide network training. Firstly, a spatial semantic low-rank reconstruction mechanism is proposed to effectively perceive distinguishable semantic features of camouflaged objects at different scales. Secondly, the COD network is dynamically supervised by generating confidence prediction maps, minimizing the false positive and false negative judgments caused by network overconfidence. Finally, a blurred awareness loss function is employed to reduce the ambiguity of the prediction. Experiments on the CAMO-Test, COD10K-Test and NC4K datasets demonstrate that DSSRNet excludes interference better and achieves more accurate segmentation results.
2024 Vol. 37 (8): 678-691
692 Lightweight Steel Surface Defect Detection Algorithm Based on Improved RetinaNet
WANG Weijia, ZHANG Yu, WANG Jinghua, XU Yong
Existing defect detection algorithms are too slow and insufficiently accurate to meet the requirements of practical applications. To address these issues, a lightweight steel surface defect detection algorithm based on improved RetinaNet is proposed. Firstly, the original backbone network is replaced by a lightweight network, and a cross-stage partial structure is introduced to achieve effective gradient propagation while keeping the model lightweight. Then, depthwise separable convolution is employed to replace the traditional convolutional layer, further reducing the number of parameters and improving the detection speed. To compensate for the decrease in model accuracy caused by lightweighting, a spatial pyramid pooling mechanism based on the cross-stage partial structure is designed. The detection accuracy of the model is effectively improved by feature fusion at different scales. Finally, experiments on the NEU-DET dataset and the self-built HBIS dataset demonstrate that the proposed algorithm achieves a faster detection speed and higher accuracy. Moreover, the corresponding hardware and software system meets the real-time online detection requirements of the production line and has been put into service.
2024 Vol. 37 (8): 692-702
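The parameter saving from replacing a standard convolution with a depthwise separable one can be checked with simple arithmetic. The sketch below is only an illustration: the function names and the layer sizes (128 input channels, 256 output channels, 3 x 3 kernel) are assumptions chosen for the example and are not taken from the paper, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only.
std = standard_conv_params(128, 256, 3)
sep = depthwise_separable_params(128, 256, 3)
ratio = sep / std  # equals 1/c_out + 1/k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.4f}")
```

In general the ratio is 1/c_out + 1/k², so for a 3 x 3 kernel the separable layer needs a little more than one ninth of the weights.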
703 Cross-Channel Feature-Enhanced Graph Convolutional Network for Skeleton-Based Action Recognition
WU Zhize, CHEN Sheng, TAN Ming, SUN Fei, YANG Jing
Traditional graph convolutional networks for skeleton-based action recognition struggle to model long-range joint relationships and long-term temporal information due to their local operation mode, and thus fail to capture subtle variations between actions. To address this problem, a cross-channel feature-enhanced graph convolutional network (CFE-GCN) for skeleton-based action recognition is proposed. It comprises a dual part-wise grouping graph convolution (DPG-GC) module, a cross-stage partial dense connections (CS-PDC) module and a multi-scale temporal convolution (MS-TC) module. The DPG-GC module models the human body joints with a grouping strategy to extract multi-granularity features and capture the subtle local differences between the joints. The CS-PDC module establishes associations between nodes and the previous network layers, enriching the early information and capturing the potential long-term relationships between the moving joints, so that contextual features are learned more comprehensively. The MS-TC module performs temporal convolution with different receptive fields to capture both short-term and long-term dependencies in the temporal domain. Experiments show that CFE-GCN achieves superior performance on multiple benchmark datasets.
2024 Vol. 37 (8): 703-714
Papers and Reports
715 Narrative-Driven Large Language Model for Temporal Knowledge Graph Prediction
CHEN Juan, ZHAO Xinchao, SUI Jingyan, QI Lin, TIAN Chen, PANG Liang, FANG Jinyun
The temporal knowledge graph (TKG) is highly sparse, and the long-tail distribution of entities leads to poor generalization when reasoning about out-of-distribution entities. Additionally, the low frequency of historical interactions results in biased predictions for future events. Therefore, a narrative-driven large language model for TKG prediction is proposed. The world knowledge and complex semantic reasoning capabilities of large language models are leveraged to enhance the understanding of out-of-distribution entities and the association of sparse interaction events. Firstly, a key event tree is selected based on the temporal and structural characteristics of the TKG, and the most representative events are extracted through a historical event filtering strategy. Relevant historical information is summarized to reduce the input data while retaining the most important information. Then, the large language model generator is fine-tuned to produce logically coherent "key event tree" narratives as unstructured input. During the generation process, special attention is paid to the causal relationships and temporal sequences of events to ensure the coherence and rationality of the generated stories. Finally, the large language model is utilized as a reasoner to infer the missing temporal entities. Experiments on three public datasets demonstrate that the proposed method effectively leverages the capabilities of large models to achieve more accurate temporal entity reasoning.
2024 Vol. 37 (8): 715-728
Researches and Applications
729 Class-Incremental Learning Method Based on Feature Space Augmented Replay and Bias Correction
SUN Xiaopeng, YU Lu, XU Changsheng
Catastrophic forgetting arises when a network learns new knowledge continuously. Various incremental learning methods have been proposed to solve this problem, and one mainstream approach balances the plasticity and stability of incremental learning by storing a small amount of old data and replaying it. However, storing data from old tasks can lead to memory limitations and privacy breaches. To address this issue, a class-incremental learning method based on feature space augmented replay and bias correction is proposed to alleviate catastrophic forgetting. Firstly, the mean feature of an intermediate layer for each class is stored as its representative prototype, and the low-level feature extraction network is frozen to prevent prototype drift. In the incremental learning stage, the stored prototypes are augmented and replayed through geometric translation transformation to maintain the decision boundaries of the previous tasks. Secondly, bias correction is proposed to learn classification weights for each task, further correcting the classification bias of the model towards new tasks. Experiments on four benchmark datasets show that the proposed method outperforms the state-of-the-art algorithms.
2024 Vol. 37 (8): 729-740
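The prototype idea in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the exact translation used, so the scheme below is an assumption: each stored prototype is a class mean, and replay features for an old class are produced by shifting the features of a new class so that their mean lands on the old prototype. The names class_prototypes and translate_to_prototype are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def class_prototypes(features: List[Vector], labels: List[int]) -> Dict[int, Vector]:
    """Return the mean feature vector (prototype) of every class."""
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def translate_to_prototype(new_features: List[Vector], new_mean: Vector,
                           old_prototype: Vector) -> List[Vector]:
    """Assumed translation: shift new-class features so their mean equals the old prototype."""
    return [[f - m + p for f, m, p in zip(x, new_mean, old_prototype)]
            for x in new_features]


# Toy 2-D features: class 0 plays the old class, class 1 the new class.
feats = [[0.0, 0.0], [2.0, 2.0], [10.0, 10.0], [12.0, 14.0]]
labs = [0, 0, 1, 1]
protos = class_prototypes(feats, labs)
replayed = translate_to_prototype(feats[2:], protos[1], protos[0])
print(protos, replayed)
```

The replayed vectors keep the spread of the new-class features but are centred on the stored old-class prototype, so no raw samples from the old class need to be kept.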
741 Semi-supervised Online Classification Method for Multi-label Data Stream Based on Kernel Extreme Learning Machine
WANG Yuchen, QIU Shiyuan, LI Peipei, HU Xuegang
In practical applications, large amounts of streaming data emerge, characterized by high arrival speed, massive volume and dynamic variation. Moreover, data streams often carry multiple labels, but only a small portion of the data in a stream is labeled, causing the problems of concept drift and label missing in multi-label data. To solve these problems, a semi-supervised online classification method for multi-label data streams based on kernel extreme learning machine is proposed in this paper. Firstly, the data stream is divided into k blocks according to a sliding window to tackle the label missing problem in the multi-label data stream. A feature similarity matrix and a label similarity matrix are constructed for each block of data and incorporated into the training of the kernel extreme learning machine model. An incremental update mechanism is designed to construct a semi-supervised online kernel extreme learning machine that adapts to the characteristics of streaming data. Secondly, to address concept drift in the data stream, a timestamp mechanism is adopted for discarding updates. The data size is preset. When the data reaches the specified size, the oldest unlabeled data is discarded and new data is added for updating. Finally, experiments on 10 multi-label datasets demonstrate that the proposed method adapts well to label missing and concept drift while maintaining good classification performance.
2024 Vol. 37 (8): 741-754
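The timestamp-based discarding step described in the abstract above can be sketched as a fixed-capacity buffer. This is a minimal illustration, not the authors' implementation: the Instance and TimestampBuffer names are hypothetical, and the fallback of evicting the oldest instance overall when no unlabeled instance is stored is an assumption that the abstract does not cover.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    timestamp: int
    features: List[float]
    labels: Optional[List[int]] = None  # None marks an unlabeled instance


class TimestampBuffer:
    """Fixed-size buffer that evicts the oldest unlabeled instance when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items: List[Instance] = []

    def add(self, inst: Instance) -> Optional[Instance]:
        """Insert inst and return the evicted instance, if any."""
        evicted = None
        if len(self.items) >= self.capacity:
            unlabeled = [it for it in self.items if it.labels is None]
            # Assumption: fall back to all items when nothing unlabeled is stored.
            pool = unlabeled or self.items
            evicted = min(pool, key=lambda it: it.timestamp)
            self.items.remove(evicted)
        self.items.append(inst)
        return evicted


buf = TimestampBuffer(capacity=3)
buf.add(Instance(0, [0.1], labels=[1, 0]))
buf.add(Instance(1, [0.2]))
buf.add(Instance(2, [0.3]))
dropped = buf.add(Instance(3, [0.4], labels=[0, 1]))
print(dropped.timestamp, [it.timestamp for it in buf.items])
```

Here the labeled instance with timestamp 0 survives even though it is the oldest, because the unlabeled instance with timestamp 1 is discarded first.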
Supervised by
China Association for Science and Technology
Sponsored by
Chinese Association of Automation
National Research Center for Intelligent Computing System
Institute of Intelligent Machines, Chinese Academy of Sciences
Published by
Science Press
 
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No.350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax:0551-65591176 Email: bjb@iim.ac.cn