Pattern Recognition and Artificial Intelligence
2025 Vol.38 Issue.11, Published 2025-11-25

Diversified Identification and Understanding of Instantaneous Multimodality and Sequential Visual Data
961 Point Cloud Anomaly Detection Network Based on Local Perception Graph Reconstruction
GAO Chengling, HUANG Changqin, ZHENG Zhonglong, JIANG Yunliang

Point cloud anomaly detection aims to identify defective samples from the overall data distribution and to locate the abnormal regions that deviate spatially from the expected pattern. Existing global matching strategies struggle to capture anomalies concentrated in subtle local geometric structures. To address this issue, a point cloud anomaly detection network based on local perception graph reconstruction is proposed. First, the point cloud is modeled as a graph structure, and locally sensitive edge convolution operations are utilized to mine local structural features, enhancing the ability to identify subtle local anomalies. Second, a local alignment reconstruction loss is designed based on a subgraph structure alignment strategy to amplify the differences between normal and abnormal samples at the structural level. Furthermore, a local anomaly simulation strategy is introduced to construct anomaly-normal sample pairs. Through this strategy, the model is constrained to learn the pattern mapping from abnormal samples to normal samples. Finally, a local matching algorithm is applied to calculate the differences between abnormal samples and the expected normal samples, thereby detecting abnormal regions in the point cloud. Experimental results show that the proposed method significantly enhances the perception of local details and achieves excellent performance on multiple categories of point cloud anomaly detection tasks.

2025 Vol. 38 (11): 961-973
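The abstract above models a point cloud as a graph and applies edge convolution to local neighborhoods. As a rough illustration only, the sketch below builds a k-nearest-neighbor graph and forms generic EdgeConv-style edge features (x_i, x_j - x_i). The function name `knn_edge_features`, the toy `cloud`, and the choice of k are assumptions made for illustration; the paper's locally sensitive edge convolution is not specified in the abstract and may differ.

```python
import math


def knn_edge_features(points, k):
    """Return, for each point index i, a list of (j, feature) pairs.

    j is one of the k nearest neighbours of point i, and feature is the
    concatenation (x_i, x_j - x_i), a common EdgeConv-style edge feature.
    This is a generic sketch, not the paper's operator.
    """
    edges = {}
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        edges[i] = [
            (j, tuple(p) + tuple(q - a for a, q in zip(p, points[j])))
            for j in others[:k]
        ]
    return edges


# Hypothetical toy point cloud with four 3-D points.
cloud = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
graph = knn_edge_features(cloud, k=2)
```

Because each feature depends only on a point and its nearest neighbors, a small geometric defect changes the features of nearby edges while leaving the rest of the graph untouched, which is the intuition behind local rather than global matching.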
974 Uncertainty-Aware Prototypical Learning Method for Cross-Domain Emotion Recognition
WANG Yilin, YE Hailiang, GUO Wenhui, CAO Feilong

Electroencephalogram (EEG)-based emotion recognition attracts increasing attention due to its broad applications in mental health monitoring and brain-computer interfaces. However, existing methods still suffer from inadequate utilization of electrode spatial information, limited labeled data, and cross-domain distribution shifts. To address these issues, an uncertainty-aware prototypical learning method for cross-domain emotion recognition is proposed. First, a position encoding-guided graph semi-supervised module is designed. The spatial topology is embedded into the adjacency matrix by incorporating sine-cosine positional encoding. This overcomes the limitation of traditional graph constructions based solely on physical distance and effectively integrates physiological priors of EEG signals. On this basis, a graph-based feature propagation mechanism is leveraged to collaboratively learn deep topological representations from both labeled and unlabeled samples, mitigating the challenge of label scarcity. Second, an uncertainty-aware prototype learning module is constructed to dynamically refine emotion prototypes by quantifying feature reliability. As a result, noise and cross-domain shifts are suppressed and the generalization ability of the model across domains is enhanced. Experiments on multiple public EEG emotion datasets demonstrate that the proposed method outperforms state-of-the-art methods on cross-domain recognition tasks.

2025 Vol. 38 (11): 974-985
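The abstract above mentions embedding spatial topology into an adjacency matrix through sine-cosine positional encoding. The sketch below shows the standard Transformer-style sine-cosine encoding, followed by one hypothetical way to turn encodings into an adjacency matrix (normalized dot-product similarity). The scalar electrode indices, the function names, and the similarity rule are all assumptions for illustration; the abstract does not say how the paper combines the encoding with the graph.

```python
import math


def sincos_encoding(position, dim):
    """Standard sine-cosine positional encoding of a scalar position.

    Even slots hold sin(position / 10000^(i/dim)), odd slots the matching
    cosine. dim is assumed to be even.
    """
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        enc.append(math.sin(position * freq))
        enc.append(math.cos(position * freq))
    return enc


def encoding_adjacency(positions, dim):
    """Hypothetical adjacency: dot product of encodings, scaled to [-1, 1].

    Each sin/cos pair contributes 1 to the squared norm, so every encoding
    has squared norm dim / 2 and the diagonal of the result equals 1.
    """
    encs = [sincos_encoding(p, dim) for p in positions]
    norm = dim / 2
    return [
        [sum(a * b for a, b in zip(ei, ej)) / norm for ej in encs]
        for ei in encs
    ]


# Hypothetical electrode indices used as scalar positions.
electrode_index = [0, 1, 2, 3]
A = encoding_adjacency(electrode_index, dim=4)
```

Under these assumptions the edge weight depends only on the difference between two positions, so the resulting matrix is symmetric and electrodes with similar positions receive larger weights.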
986 Industrial Anomaly Detection Method Based on Deviation Diffusion with Two-Stream Contrastive Learning
ZHU Sheng, ZHANG Yafei, LI Huafeng

Industrial anomaly detection is crucial to intelligent manufacturing and quality control. However, due to the scarcity and diverse distribution of anomaly samples, existing methods are prone to reconstruction distortion or feature confusion in multi-class scenarios, leading to inaccurate anomaly localization. To address these issues, an industrial anomaly detection method based on deviation diffusion and dual-stream contrastive learning is proposed. The deviation direction from noisy features to normal features in the latent space is learned directly through a deviation diffusion model. The normal structure of specific classes is captured by combining class embedding. During inference, a robust anomaly heatmap is generated by fusing the differences in the latent space domain and the image domain. Simultaneously, dual-stream contrastive learning is introduced to widen the representation gap between normal samples and anomalies. Furthermore, a foreground-aware anomaly synthesis strategy is designed to construct complementary anomalies with structural and textural characteristics, providing high-quality anomaly samples for contrastive learning between the normal stream and the synthesized-anomaly stream. Experimental results on the MVTec-AD and VisA datasets show that the proposed method achieves excellent performance on multiple evaluation metrics and exhibits strong generalization ability.

2025 Vol. 38 (11): 986-998
999 Lightweight Single-Image Super-Resolution Network Based on Four-Path Multi-scale Attention Module and Bridging Structure
SU Bohejun, XU Yong, XUE Rui, LIU Wei, WANG Haoqian

A2F-SD, an image super-resolution network, is built on a squeeze-and-excitation attention module. In this paper, the limitations of A2F-SD are analyzed, including its overly simple attention mechanism and insufficient utilization of multi-path information. To address these issues, a lightweight single-image super-resolution network based on a four-path multi-scale attention module and bridging structure (A2F-MSAB) is proposed to optimize the attention modules and hourglass structure of the original A2F-SD model. First, multi-scale self-learning average-variance attention modules are designed. A four-path attention module is employed to enhance the image feature extraction ability of the network. Then, the hourglass structure of A2F-SD is improved into a bridge and multi-level hourglass structure to facilitate information flow between different modules and between shallow and deep layers of the model, thereby strengthening the reconstruction of medium- and high-frequency details. Experiments show that the A2F-MSAB model achieves superior performance with only half of the basic modules stacked, and its evaluation metrics on certain datasets outperform those of both A2F-SD and A2F-S.

2025 Vol. 38 (11): 999-1012
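The abstract above starts from a squeeze-and-excitation attention module. For readers unfamiliar with it, the sketch below shows the generic squeeze-and-excitation computation on a tiny feature map: global average pooling per channel, a two-layer bottleneck with ReLU and sigmoid, then channel-wise rescaling. The weights `w1` and `w2`, the omission of biases, and the toy input are assumptions for illustration; this is not the A2F-SD or A2F-MSAB attention module.

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """Generic squeeze-and-excitation over a list of C channels.

    feature_map: C channels, each an H x W list of lists of floats.
    w1: bottleneck weights, shape (C // r) x C (biases omitted).
    w2: expansion weights, shape C x (C // r) (biases omitted).
    """
    # Squeeze: one scalar per channel via global average pooling.
    z = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    s = [
        1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h))))
        for row in w2
    ]
    # Scale: multiply every value in channel c by its gate s[c].
    return [
        [[s[c] * x for x in row] for row in ch]
        for c, ch in enumerate(feature_map)
    ]


# Hypothetical 2-channel, 1 x 2 feature map and hand-picked weights.
fmap = [[[2.0, 4.0]], [[1.0, 3.0]]]
out = squeeze_excite(fmap, w1=[[1.0, 1.0]], w2=[[0.0], [1.0]])
```

The gate for each channel is computed from the channel mean alone, which is the sense in which the abstract calls this attention mechanism simple and motivates adding further statistics and paths.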
1013 Multimodal Query-Guided End-to-End Person Search
ZHOU Chun, ZHANG Yafei, WANG Hongbin

Current person search methods are predominantly limited to image-based queries, and their retrieval accuracy is significantly restricted by low-quality query images or incomplete pedestrian features. Furthermore, mainstream methods rely on region proposal networks and non-maximum suppression to generate predefined candidate boxes, making it difficult to achieve end-to-end person search directly from a query to a panoramic gallery. Therefore, a multimodal query-guided end-to-end person search method is proposed. Textual descriptions of pedestrians are introduced as an auxiliary modality to address the limitation of relying solely on visual information. The pedestrian detection and re-identification tasks are jointly optimized within an end-to-end architecture. To enhance the semantic completeness of pedestrian representations, the differing semantic information between the query image and the text description is explored so that more comprehensive pedestrian information is learned. Then, a cross-modal attention mechanism is utilized to enhance the pedestrian features in the gallery images that correspond to the query information, improving the discriminative ability of the pedestrian features. Finally, a Transformer-based detection module is adopted. It discards the traditional pipeline of region proposal networks and non-maximum suppression and directly outputs the final person search results. Experiments on challenging datasets demonstrate the superior performance of the proposed method.

2025 Vol. 38 (11): 1013-1026
1027 Hierarchical Transformer for Micro-Expression Recognition with Spatial-Channel Features and Graph Attention
CAO Chunping, WEI Jinxin

Existing micro-expression recognition approaches are limited in multi-scale feature extraction, inter-region relationship modeling, and computational efficiency. To address these issues, a hierarchical Transformer for micro-expression recognition with spatial-channel features and graph attention (HT-SCGA) is proposed. First, a multi-scale dynamic window module is designed for hierarchical extraction from local fine-grained features to global coarse-grained features by adaptively expanding receptive fields. Second, a dual-domain feature association module is introduced to enhance feature representation and reduce computational complexity by jointly modeling spatial and channel dependencies. Finally, a graph attention aggregation module is constructed to explicitly model semantic correlations among key facial regions and strengthen the coordinated representation of facial action units. Experiments on three benchmark datasets, SMIC, CASME II, and SAMM, demonstrate that HT-SCGA outperforms existing methods on the UF1 and UAR metrics. These results verify the effectiveness and efficiency of HT-SCGA for micro-expression recognition.

2025 Vol. 38 (11): 1027-1040
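The abstract above reports results on UF1 and UAR. These are commonly defined as the unweighted (macro) average of per-class F1 scores and the unweighted average of per-class recall, so that rare classes count as much as frequent ones. The sketch below computes both from label lists under that common definition; the function name `uf1_uar` and the toy labels are assumptions for illustration, and the paper's exact evaluation protocol is not described in the abstract.

```python
def uf1_uar(y_true, y_pred, num_classes):
    """Return (UF1, UAR): macro-averaged F1 and macro-averaged recall."""
    f1_scores, recalls = [], []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # Per-class F1 = 2TP / (2TP + FP + FN); 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
        # Per-class recall = TP / (TP + FN); 0 when the class has no samples.
        recalls.append(tp / (tp + fn) if (tp + fn) else 0.0)
    return sum(f1_scores) / num_classes, sum(recalls) / num_classes


# Hypothetical labels for a 3-class problem with an imbalanced class 2.
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0]
uf1, uar = uf1_uar(y_true, y_pred, num_classes=3)
```

In this toy case plain accuracy is 5/7, but both UF1 and UAR are lower because the single sample of class 2 is missed entirely, which shows why these metrics are preferred on imbalanced micro-expression datasets.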
1041 Lightweight Heterogeneous Collaborative Dual-Path Bridged Network for Small-Target Detection in UAV Images
XIE Kaijing, CHEN Junying

To address the issues of highly dense small targets, severe occlusion in UAV images, and high computational complexity of existing general detection models, a lightweight heterogeneous collaborative dual-path bridged network(LHCB-Net) for small target detection in UAV images is proposed in this paper. Based on the YOLOv9 framework, the functionally homogeneous stacking and the limited receptive field of RepNCSPELAN4 are decoupled. A backbone network is constructed by combining the reparameterized convolution block RepVCBlock and the global attention large selective kernel network(Galsk). Thus, a three-dimensional attention mechanism covering channel, space and global contextual dependencies is integrated to effectively enhance the perception of small targets. Galsk combines global modeling and feedforward enhancement mechanisms to improve feature extraction in complex backgrounds and occlusion scenarios and compensate for the receptive field reduction caused by lightweight design. Moreover, secondary backbone features are directly connected to the detection head through cross-layer bridging to achieve multi-scale feature fusion and optimize localization accuracy. Additionally, a scale-adaptive IoU loss function is introduced to dynamically adjust regression weights for targets of different scales. Experimental results on VisDrone2019, UAVDT, and a self-built dataset demonstrate that LHCB-Net significantly improves detection performance for dense small targets while reducing parameters and computational cost, providing an efficient solution for real-time onboard detection. The complete code is available at: https://github.com/tson122556/LHCB-Net/tree/master.

2025 Vol. 38 (11): 1041-1054
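The abstract above introduces a scale-adaptive IoU loss for targets of different sizes. The sketch below computes only the plain intersection over union (IoU) of two axis-aligned boxes and compares the effect of a one-pixel shift on a small and a large box. The function name `box_iou`, the (x1, y1, x2, y2) box format, and the example boxes are assumptions for illustration; the paper's scale-adaptive weighting is not specified in the abstract and is not reproduced here.

```python
def box_iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The same one-pixel horizontal shift applied to a small and a large box.
iou_small = box_iou((0, 0, 4, 4), (1, 0, 5, 4))
iou_large = box_iou((0, 0, 40, 40), (1, 0, 41, 40))
```

The same localization error costs the 4 x 4 box far more IoU than the 40 x 40 box, which is one reason a loss that treats all scales identically can penalize small targets unevenly.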

Supervised by
China Association for Science and Technology
Sponsored by
Chinese Association of Automation
National Research Center for Intelligent Computing System
Institute of Intelligent Machines, Chinese Academy of Sciences
Published by
Science Press
 
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No. 350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax: 0551-65591176 Email: bjb@iim.ac.cn